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1.  Introduction 


The  notion  of  program  correctness  is  fundamental  in  software 
engineering  and  is  of  increasing  importance  as  computers  are  used 
in  more  critical  applications.  Before  being  released  for  use, 
software  usually  undergoes  a  validation  process,  i.e.  an  act  of 
reasoning  in  favor  of  its  correctness.  There  exist  two  major 
validation  approaches.  The  first  is  testing,  and  consists  of 
executing  the  program  on  a  number  of  inputs  and  then  showing  that 
the  results  are  in  agreement  with  the  program  specification.  The 
alternative  validation  technique  is  program  verification  (or  a 
proof  of  program  correctness)  .  A  program  is  verified  by  apply¬ 
ing,  at  some  level  of  formality,  logical  and  mathematical  reason¬ 
ing  concerning  the  program  and  specification  in  order  to  certify 
their  consistency.  In  this  dissertation,  we  consider  the  verifi¬ 
cation  approach  to  the  program  validation  problem. 

Unfortunately,  a  proof  of  correctness  for  any  nontrivial 
program  is  a  task  that  may  be  fraught  with  many  difficulties. 
While  our  research  addresses  a  number  of  these  issues,  there  are 
others  which  we  choose  to  ignore.  For  one  thing,  we  assume  that 
the  program  specification  has  been  sufficiently  formalized  to 
make  a  rigorous  proof  of  correctness  possible.  In  practice,  this 
can  be  a  formidable  task  for  programs  which  are  intended  to  solve 
complex  problems.  Secondly,  we  will  restrict  our  attention  to 
procedureless,  sequential,  structured  programs,  we  do  not  con¬ 
sider,  therefore,  the  issues  involved  in  proving  programs  con¬ 
taining  global/local  variables,  GOTOs,  parameter  passing  or  con¬ 
currency. 

Although  our  primary  interest  in  studying  verification  is  as 
a  method  of  validating  programs,  research  in  this  field  often  has 
implications  for  other  areas  of  software  engineering.  For  one 
thing,  a  crucial  aspect  of  proving  the  correctness  of  a  program 
is  the  act  of  reading  and  understanding  its  behavior.  Any  tech¬ 
nique  or  methodology  which  facilitates  this  act  is  useful  not 
only  as  a  verification  tool,  but  also  as  an  aid  in  comprehending, 
documenting,  modifying  and  maintaining  programs.  Secondly, 
verification  research  can  provide  insight  into  what  constitutes  a 
"good”  program.  Many  practices  associated  with  "structured  pro¬ 
gramming,"  for  example,  are  aimed  at  producing  programs  which  can 
feasibly  be  proven  correct.  It  is  reasonable  to  expect  that 
research  into  program  characteristics  which  influence  their  veri¬ 
fiability  should  well  have  significant  implications  for  the  pro¬ 
gram  development  process. 

Our  approach  to  program  verification  is  based  largely  on  the 
work  in  [Mills  72,  Mills  75].  This  methodology  has  come  to  be 
known  as  functional  correctness  and  centers  around  rules  for 
proving  a  prime  program  (i.e.  basic  flow  chart  structure)  correct 
with  respect  to  a  specification  which  has  been  formulated  as  a 
mathematical  function.  The  goal  of  our  research  is  a  thorough 
study  of  a  number  of  key  issues  dealing  with  the  practical 


application  of  the  functional  correctness  methodology. 

A  straightforward  version  of  functional  correctness  based  on 
prime  program  decomposition  of  composite  programs  is  defined  in 
Chapter  2.  This  verification  strategy  is  then  compared  and  con¬ 
trasted  with  other  standard  verification  methodologies.  In  par¬ 
ticular,  the  goal  of  the  study  is  to  analyze  the  theoretical 
relationships  between  functional,  inductive  assertion  and  subgoal 
induction  proofs  of  correctness.  In  the  analysis,  it  is  observed 
that  the  prime  program  decomposition  strategy  causes  some  func¬ 
tional  correctness  proofs  to  be  somewhat  more  complex  than  their 
inductive  assertion  or  subgoal  induction  counterparts. 

A  primary  impediment  to  applying  functional  correctness  on  a 
large  scale  in  practice  is  the  requirement  that  each  WHILE  loop 
in  the  program  be  tagged  with  an  adequate  intended  function . 
Such  a  function  describes  the  intended  input/output  behavior  of 
the  loop  over  a  suitably  general  input  domain.  Vftile  supplying 
intended  functions  with  program  segments  is  no  doubt  a  good  pro¬ 
gramming  practice,  programmers  often  fail  to  supply  this  documen¬ 
tation  due  to  the  effort  and  difficulty  involved.  This  is  par¬ 
ticularly  true  of  intended  loop  functions  (as  opposed,  for  exam¬ 
ple,  to  those  for  initialized  loops)  since  they  tend  to  be  rela¬ 
tively  complex  and  difficult  to  state.  Furthermore,  most  program 
maintenance  is  currently  being  done  on  programs  with  undocumented 
intended  loop  functions.  Chapters  3,  4  and  5  deal  largely  with 
the  issue  of  verifying  programs  which  contain  WHILE  loops  with 
missing  or  inadequate  intended  functions. 

In  Chapter  3,  a  new  verification  strategy  for  composite 
structured  programs  is  described.  Based  on  functional  correct¬ 
ness,  the  technique  is  intended  to  be  applied  in  the  circumstance 
where  there  is  no  intended  function  supplied  for  a  WHILE  loop 
contained  in  the  program.  The  method  employs  a  reduction 
hypothesis  which  dispenses  with  the  need  for  the  intended  loop 
function.  Its  merit  is  due  to  the  fact  that  it  may  be  easier  to 
invent  a  reduction  hypothesis  than  to  create  the  intended  loop 
function.  Finally,  the  methodology  uses  the  idea  of  a  proof 
transformation  to  overcome  the  above  mentioned  deficiencies  of 
prime  program  decomposition  of  composite  programs. 

In  Chapter  4,  we  again  consider  the  problem  of  a  missing 
intended  loop  function  but  restrict  our  attention  to  initialized 
WHILE  loop  programs.  The  approach  taken  here  is  to  create  an 
adequate  intended  function  for  the  WHILE  loop  and  then  to  proceed 
with  a  standard  functional  correctness  proof.  A  heuristic  method 
is  presented  for  synthesizing  these  functions.  Although  the 
heuristic  seems  useful  in  a  wide  range  of  cases,  the  point  is 
made  that  its  successful  application  relies  on  the  program  exhi¬ 
biting  a  "reasonable”  form  of  loop  behavior. 

The  notion  of  a  uniformly  implemented  loop  is  defined  in 
Chapter  5.  Our  purpose  here  is  twofold.  First,  the  definition 
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is  intended  to  characterize  this  idea  of  "reasonable”  or  "well 
structured"  loop  behavior.  Secondly,  the  definition  serves  as  the 
basis  of  a  technique  for  systematically  generalizing  (i.e.  making 
more  descriptive)  intended  functions  for  uniformly  implemented 
WHILE  loops.  The  technique  can  be  used  to  enhance  a  "skeleton" 
intended  loop  function  to  the  point  where  it  is  adequate  for  a 
functional  proof  of  correctness. 

In  Chapter  6,  we  conclude  with  a  summary  of  our  research  and 
offer  several  suggestions  for  future  work. 
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2.  A  Comparative  Analysis  of  Functional  Correctness 

The  relationship  between  programs  and  the  mathematical  func¬ 
tions  they  compute  has  long  been  of  interest  to  computer  scien¬ 
tists  [McCarthy  63,  Strachey  66] .  More  recently,  [Mills  72,  75] 

has  developed  a  model  of  functional  correctness,  i.e.  a  technique 
for  verifying  a  program  correct  with  respect  to  an  abstract  func¬ 
tional  specification.  This  theory  has  been  further  developed  by 
[Basu  &  Misra  75,  Misra  78]  and  now  appears  as  a  viable  alterna¬ 
tive  to  the  inductive  assertion  verification  method  due  to  [Floyd 
67,  Hoare  69]  . 

In  this  chapter,  a  tutorial  view  of  the  functional  correct¬ 
ness  theory  is  presented  which  is  based  on  a  set  of  structured 
programming  control  structures.  An  implication  of  this  verifica¬ 
tion  theory  for  the  derivation  of  loop  invariants  is  discussed. 
The  functional  verification  technique  is  contrasted  and  compared 
with  the  inductive  assertion  and  subgoal  induction  techniques 
using  a  common  notation  and  framework.  In  this  analysis,  the 
functional  verification  conditions  concerning  program  loops  are 
shown  to  be  quite  similar  to  the  subgoal  induction  verification 
conditions  and  a  specialization  of  the  commonly  used  inductive 
assertion  verification  conditions.  Finally,  the  difficulty  of 
proving  initialized  loops  is  examined  in  light  of  the  functional 
and  inductive  assertion  theories. 

In  order  to  describe  the  functional  correctness  model,  we 
consider  a  program  P  with  variables  vl,  v2,  ...  ,  vn.  These 
variables  may  be  of  any  type  and  complexity  (e.g.  reals,  struc¬ 
tures,  files,  etc.)  but  we  assume  each  vi  takes  on  values  from  a 
set  di.  The  set  D  *  dl  x  d2  x  . . .  x  dn  is  the  data  space  for  P; 
an  element  of  D  is  a  data  state .  A  data  state  can  be  thought  of 
as  an  assignment  of  values  to  program  variables  and  is  written 
<cl,c2 , . . .  ,cn>  where  each  vi  has  been  assigned  the  value  ci  in 
d  i. 


The  effect  of  a  program  can  be  described  by  a  function  f  :  D 
->  D  which  maps  input  data  states  to  output  data  states.  If  P  is 
a  program,  the  function  computed  by  P,  written  [P] ,  is  the  set  of 
ordered  pairs  {(X,Y)  |  if  P  begins  execution  in  data  state  X,  P 
will  terminate  in  final  state  y}.  The  domain  of  [P]  is  thus  the 
set  of  data  states  for  which  P  terminates. 

If  the  specifications  for  a  program  P  can  be  formulated  as  a 
data-state-to-data-state  function  f,  the  correctness  of  a  program 
can  be  determined  by  comparing  f  with  [P].  Specifically,  we  say 
that  P  computes  f  if  and  only  if  f  is  a  subset  of  [P] .  That  is, 
if  f  (X)  *  Y  for  some  data  states  X  and  Y,  we  require  that  [P]  (X) 
be  defined  and  be  equal  to  Y.  Note  that  in  order  for  P  to  com¬ 
pute  f,  no  explicit  requirement  is  made  concerning  the  behavior 
of  P  on  inputs  outside  the  domain  of  f. 
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Example  2.1  -  Consider  the  simple  program 

P  ■  while  a  >  0  do 
ET” :*  b  *  a ; 
a  :•  a  -  1 
od . 

The  function  computed  by  the  program  can  be  written  as 

[P]»{ (<a,b> ,<0,b* (al) >)  |  a>»0}  U  { (<a,b> ,<a,b>)  |  a<0} . 

Thus  if  a  is  greater  than  or  equal  to  zero,  the  program  maps  a 
and  b  to  0  and  b*(a!)  respectively,  otherwise  the  program  per¬ 
forms  the  identity  mapping.  As  a  notational  convenience,  we 
often  use  conditional  rules  and  data-state-to-data-state  "assign¬ 
ments"  (called  concurrent  assignments)  to  express  functions.  In 
this  notation  we  have 

[PI  ■  (a>*0  ->  a,b  :■  0,b*(al)  (TRUE  ->  a,b  :*  a,b). 

Finally,  if  we  are  given  f  ■  (a>*0  ->  a,b  :*  0,b*(ai))  as  the 
function  to  be  computed,  we  may  say  that  P  computes  f,  since  f  is 
a  subset  of  [P] . 

2.1.  The  Functional  Correctness  Technique 

We  will  make  use  of  the  following  notation.  The  domain  of  a 
function  f  will  be  written  as  D(f).  The  notation  fog  will  be 
used  to  represent  the  composition  of  the  functions  g  and  f.  If  g 
is  a  function  or  binary  relation,  g**-l  represents  the  binary 
relation  which  corresponds  to  the  inverse  of  g.  We  will  use  the 
shorthand  B*H  for  the  WHILE  loop  program 

while  B (X)  do 
X  :=  H  (X) 
od . 

In  this  program,  X  represents  the  program  data  state,  B  is  a 
total  predicate  on  the  data  state,  and  H  is  a  data-state-to- 
data-state  function  which  represents  the  input/output  effect  of 
the  loop  body. 

The  functional  correctness  method  relies  heavily  on  a  tech¬ 
nique  for  verifying  that  a  WHILE  loop  computes  a  given  state -to- 
state  function,  we  present  this  WHILE  loop  technique  as  a 
theorem  and  then  describe  the  method  for  general  programs,  we 
first  need  the  following  definition. 

Definition  2.1  -  The  loop  B*H  is  closed  for  a  set  of  data 
states  S  if  and  only  if  (iff) 

X  6  S  &  B(X)  ->  H(X)  6  S. 

Intuitively,  a  loop  is  closed  for  S  if  the  data  state  remains  in 
S  as  it  executes  for  any  input  in  S. 

Theorem  2.1  -  If  the  loop  B*H  is  closed  for  the  domain  of  a 
function  f,  tEen  the  loop  computes  f  iff,  for  all  X  6  D(f) 

(2.1)  the  loop  terminates  when  executed  in 
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the  initial  state  X, 

(2.2)  B  (X)  ->  f(X)  *  £ (H  (X) )  ,  and 

(2.3)  ~B (X)  ->  f(X)  »  X. 

Proof  -  First,  suppose  (2.1),  (2.2),  and  (2.3)  hold.  Let 
X[0]  be  any  element  of  D(f).  By  condition  (2.1)  the  loop  must 
produce  some  output  after  a  finite  number  of  iterations.  Let  n 
represent  this  number  of  iterations,  and  let  X[n]  represent  the 
output  of  the  loop.  Furthermore,  let  X[l]  ,  X[2)  ,  ...  X[n-1]  be 

the  intermediate  states  generated  by  the  loop,  i.e.  for  all  i 
satisfying  0  <■  i  <  n,  we  have  B(X[i])  &  X[i+1]  »  H(X[i])  and 

also  ~B(X[n]).  Condition  (2.2)  shows  f(X[0])  *  f(X[l])  ■ 

...  •  f(X[nj).  Condition  (2.3)  indicates  f(X[n])  »  X[n]  .  Thus 
f(X(0] )  »  X[nj  and  the  loop  computes  f. 

Secondly,  suppose  the  loop  computes  f.  This  fact  would  be 
contradicted  if  (2.1)  were  false.  Suppose  (2.2)  were  false,  i.e. 
there  exists  an  X  6  D(f)  for  which  B(X)  but  f(X)  j*  f(H(X)).  From 
the  closure  requirement,  H(X)  e  D(f)  and  the  loop  produces 
f  (H  (X) )  when  given  the  input  H(X).  But  this  implies  the  loop  can 
distinguish  between  the  cases  where  H(X)  is  an  input  and  the  case 
where  H(X)  is  an  intermsdiate  result  from  the  input  X.  However, 
this  is  impossible  since  the  state  describes  the  values  of  all 
program  variables.  Finally,  if  (2.3)  were  false,  there  would 
exist  an  X  6  D(f)  for  which  the  loop  produces  X  as  an  output,  but 

where  f(X)  f  X.  Thus  the  loop  must  not  compute  f. 

An  important  aspect  of  Theorem  2.1  is  the  absence  of  the 
need  for  an  inductive  assertion  or  loop  invariant,  under  the 
conditions  of  the  theorem,  a  loop  can  be  proven  or  disproven 
directly  from  its  function  specification. 

Example  2.2  -  Using  the  loop  P  and  function  f  of  Example 
2.1,  we  shall  show  P  computes  f.  D(f)  is  the  set  of  all  states 
satisfying  a  >»  0.  Since  a  is  prevented  from  turning  negative  by 

the  loop  predicate,  the  loop  is  closed  for  D(f)  and  Theorem  2.1 

can  be  applied.  The  termination  condition  (2.1)  is  valid  since  a 
is  decremented  in  the  loop  body  and  has  a  lower  bound  of  zero. 
Since  H(<a,b>)  *  <a-l,  b*a>,  condition  (2.2)  is 
a  >  0  ->  f(<a,b>)  *  f(<a-l,b*a>) 

which  is 

a  >  0  ->  <0,b*(al)>  ■  <0 ,  (b*a)  *  ( (a-1)  1)  > 
which  can  be  shown  to  be  valid  using  the  associativity  of  *  and 
the  definition  al  -  a* ( ( a— 1 )  1 )  .  Condition  (2.3)  is 
a  ■  0  ->  <0,b*(al)>  ■  <a,b> 
which  is  valid  using  the  definition  0!  ■  1. 

The  functional  correctness  procedure  is  used  to  verify  a 
program  correct  with  respect  to  a  function  specification.  Large 
programs  must  be  broken  down  into  subprograms  whose  intended 
functions  may  be  more  easily  derived  or  verified.  These  results 
are  then  used  to  show  the  program  as  a  whole  computes  its 
intended  function.  The  exact  procedure  used  to  divide  the 


program  into  subprograms  is  not  specified  in  the  functional 
correctness  theory.  In  the  interest  of  simplicity,  the  technique 
presented  in  this  chapter  is  based  on  prime  program  decomposition 
[Linger,  Mills  &  Witt  79]  .  That  is,  correctness  rules  will  be 
associated  with  each  prime  program  (or  equivalently,  with  each 
statement  type)  in  the  source  language.  The  reader  should  keep 
in  mind,  however,  that  in  certain  circumstances,  other  decomposi¬ 
tion  strategies  may  lead  to  more  efficient  proofs.  One  such  cir¬ 
cumstance  is  illustrated  in  Section  2.4. 

In  our  presentation  of  the  functional  correctness  procedure, 
we  will  consider  simple  Algol-like  programs  consisting  of  assign¬ 
ment,  IF-THEN- ELSE,  WHILE  and  compound  statements.  Before  the 
correctness  technique  may  be  applied,  the  Intended  function  of 
each  loop  in  the  program  must  be  known.  Furthermore,  it  is 
required  that  each  loop  be  closed  for  the  domain  of  its  intended 
function.  These  intended  functions  must  either  be  supplied  with 
the  program  or  some  heuristic  (see  Chapters  4-5)  must  be  employed 
by  the  verifier  in  order  to  derive  a  suitable  intended  function 
for  each  loop.  This  need  for  intended  loop  functions  is  analo¬ 
gous  to  the  need  for  sufficiently  strong  loop  invariants  in  an 
inductive  assertion  proof  of  correctness. 

In  order  to  prove  that  a  structured  statement  S  (i.e.  a 
WHILE,  IF-TOEN-ELSE,  or  compound  statement)  computes  a  function 
f,  it  is  necessary  to  first  derive  the  function (s)  computed  by 
the  component  statement(s) ,  and  then  to  verify  that  S  computes  f 
using  the  derived  subfunctions.  Consequently,  the  functional 
correctness  technique  will  be  described  by  a  set  of  function 
derivation  rules  and  a  set  of  function  verification  rules.  These 
rules  are  given  in  Figure  2.1. 

Before  considering  an  example  of  the  use  of  these  rules,  we 
introduce  two  conventions  that  will  simplify  the  proofs  of  larger 
programs.  First,  we  allow  an  assignment  into  only  a  portion  of 
the  data  state  in  a  concurrent  assignment.  In  this  case  it  is 
understood  that  the  other  data-state  components  are  unmodified. 

Example  2.3  -  If  a  program  has  variables  vl,v2,v3,  the 
sequence  of  assignments 

vl  :»  4;  v3  :■  7 
performs  the  program  function 

vl,v3  :■  4,7 
which  is  shorthand  for 

vl,v2,v3  :*  4,v2,7. 

Secondly,  if  a  function  description  is  followed  by  a  list  of 
of  variables  surrounded  by  4  characters,  then  the  function  is 
intended  to  describe  the  program's  effect  on  these  variables 
only.  Other  variables  are  considered  to  have  been  set  to  an 
undefined  or  unspecified  value. 


Per ive  Rules  -  Used  to  compute  [S]. 

01:  S  *  v:«e 

1)  Return  [v:»e]  . 

D2:  S  »  S1;S2 

1)  Derive  [SI] 

2)  Derive  [S2] 

3)  Return  [S2]  o  [Si]  . 

D3:  S  *  i_f  B  then  Si  else  S2  fi 

1)  Derive  fs  1] 

2)  Derive  [S 21 

3)  Return  (B->  [Si]  |  TRUE->[S2]). 
D4s  S  *  while  B  do  SI  od 

1)  Let  f  be  the  intended  function 

(either  given  or  derived) 

2)  Verify  that  while  B  do  SI  od 

computes  f 

3)  Return  f. 

Verify  Rules  -  Used  to  prove  S  computes  f. 

VI:  S  »  v:»e 

1)  Derive  [S] 

2)  Show  f(X)-Y  ->  [S](X)  «  Y. 

V2:  S  »  S1;S2 

1)  Derive  [S] 

2)  Show  f(X)*Y  ->  [S]  (X)  »  Y. 

V3:  S  »  if  B  then  SI  else  S2  fi 

1)  Derive  fs] 

2)  Show  f  ( X)  *Y  ->  [S]  (X)  ■  Y. 

V4j  S  ■  while  B  do  Si  od 

1)  Derive~Tsi] 

2)  Apply  Theorem  2.1. 


Figure  2.1  Derive  and  Verify  Rules 


Example  2.4  -  If  a  program  has  variables  vl,v2rv3  that  take 
on  values  from  3l,d2,d3,  respectively,  the  function  description 
f  ■  (vl  >  0  ->  v2,v3  :■  vl+v3,v2)  #v2,v3t 
is  equivalent  to 

(vl  >  0  ->  vl ,v2 ,v3  :■  ?,vl+v3,v2), 
where  ?  represents  an  unspecified  value.  Note  that  in  a  sense, 
functions  like  f  are  not  data-state-to-data-state  functions:  more 
accurately  they  are  general  relations.  E.g.  in  the  exanple, 
<1 ,2 , 3>  maps  to  <2,4,2>  as  well  as  <-2,4,2>.  However,  we  adopt 
the  view  that  f  is  a  dl  x  d2  x  d3  to  d2  x  d3  mapping  and  in  this 
light,  f  is  a  function.  We  call  {v2,v3}  the  range  set  for  f. 
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written  RS(f).  Functions  not  using  the  #  notation  are  assumed  to 
have  the  entire  set  of  variables  as  their  range  set.  Similarly, 
if  the  variables  vrl,vr2,. .  ,vrk  are  the  necessary  inputs  to  a 
function  description  f,  we  say  that  {vrl,vr2,. . .  ,vrk}  is  the 
domain  set  for  f,  written  DS(f).  In  Example  2.4,  the  domain  set 
Tor  F  is  {vl,v2,v3}  which  happens  to  be  the  entire  set  of  vari¬ 
ables,  but  this  need  not  be  the  case.  Note  that  some  functions 
(i.e.  constant  functions)  may  have  an  empty  domain  set. 


Example  2.5  -  Consider  the  following  program 


Cl) 

1) 

C2) 

2) 

3) 
C3) 

4) 

5) 

6) 

7) 

8) 
9) 

10) 


—  (n>»0  ->  s  :■  SOM(i,l,m, i**n) )  #s# 

a  :*  1;  s  :*  0; 

—  (n>»0  ->  s  :■  s  +  SUM(  i,a,m,  i**n) ) 


while  a  <■  m  do 

J" !»  0;  p  s»  1; 

-  (n>«j  ->  p,  j  :■  p*a**(n-j)  ,n) 
while  i  <  n  do 

— r»-  j  ♦  tt 

p  :■  p  *  a 
od; 

s  :=  s  +  p; 
a  :■  a  +  1 


od . 


ts* 


In  this  example,  the  functions  on  the  lines  labeled  Cl,  C2  and  C3 
are  program  comments  and  define  the  intended  functions  for  the 
program,  outer  WHILE  loop  and  inner  WHILE  loop  respectively.  The 
notation  SUM(a,b,c,d)  used  in  these  functions  stands  for  the  sum¬ 
mation  from  a=b  to  c  of  the  expression  d.  We  use  the  notation 
Fn-m  as  the  derived  function  for  lines  n  through  m  of  the  pro¬ 
gram. 

Step  1)  -  Using  derive  rules  Dl  and  D2  we  get 
F5-6  »  j,p  :■  j+l,p*a. 

Step  2)  -  We  must  verify  the  inner  loop  computes  its  in¬ 
tended  function.  The  closure  condition  and  termination 
condition  are  easily  verified.  The  other  conditions 
are 

j<n  ->  <p*a**  (n-j) , n>  ■  <  (p*a)  *a**  (n-(  j+1) )  ,n> 

and 

j=n  ->  <p*a** (n-j) , n>  *  <p,j> 
which  are  true. 

Step  3)  -  Using  Dl  and  D2  we  derive  F3-7  as  follows: 

F3-7  »  (n>»j  ->  p,  j  :■  p*a**(n-j)  ,n)  o  F3-3 

*  (n>«j  ->  p,  j  :*  p*a** (n-j)  ,n)  o  j,p  :■  0,1 

■  (n>"0  ->  p,j  :■  a**n,n). 

Step  4)  -  Again  with  Dl  and  D2  we  derive  F3-9: 

F3-9  ■  F8-9  o  (n>»0  ->  p,  j  :■  a**n,n) 

■  s,a  :•  s+p,a-tl  o  (n>«0  ->  p,  j  :»  a**n,n) 

»  (n>«0  ->  p, j,s,a  :*  a**n, n, s+a**n, a+1) . 

Step  5)  -  Now  we  are  ready  to  show  the  outer  loop  computes 
its  intended  function.  Again  the  closure  and  termina¬ 
tion  conditions  are  easily  shown.  The  remaining  condi- 
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tions  ace  (where  n>»0) 

a<^n  ->  s+SUM(  i,  a,  m,  i**n)  ■  (s+a**n) +SUM(  i,a+l,m,  i**n) 
and 

a>m  ->  s+SUM(  i,a,m,  i**n)  »  s, 
both  of  which  ace  true. 

Step  6)  -  We  now  derive  Fl-10.  Applying  D2  we  get 
Fl-10  *  (n>»0  ->  s  s«  s+SUM( i,  a,m, i**n) ) #s#  o  fl-1 

*  (n>-0  ->  s  :■  s+SUM(  ir  a, m,  i**n) )  #s#  o  a,s  1,0 

■  (n>-0  ->  s  :»  SUM(i,l,m,  i**n)  )#s#. 

Step  7)  -  Since  the  intended  program  function  agrees  with 
Fl-10,  we  conclude  the  program  computes  its  intended 
function. 

The  functional  correctness  technique  was  developed  by  [Mills 
72,  Mills  75]  .  This  verification  method  is  compared  and  con¬ 
trasted  with  the  inductive  assertion  technique  in  [Basili  & 
Noonan  80]  .  The  presentation  here  is  based  on  prime  program 
decomposition  of  composite  programs  and  emphasizes  the  distinc¬ 
tion  between  function  derivation  and  function  verification  in  the 
correctness  procedure. 

The  essential  idea  behind  Theorem  2.1  can  be  traced  to 
[McCarthy  62,  McCarthy  63]  ,  in  which  a  technique  for  proving  two 
functions  equivalent,  "recursion  induction,"  is  described. 
Theorem  2.1  can  be  viewed  as  a  specific  application  of  this  tech¬ 
nique.  In  [Manna  &  Pnueli  70,  Manna  71]  and  more  recently 
[Morris  &  Wegbreit  77],  loop  verification  rules  similar  to  that 
stated  in  Theorem  2.1  are  suggested.  In  [Basu  &  Misra  75],  the 
authors  prove  a  result  which  corresponds  to  Theorem  2.1  for  the 
case  where  the  loop  contains  local  variables. 

The  closure  requirement  of  Theorem  2.1  has  received  consid¬ 
erable  attention.  Several  classes  of  loops  which  can  be  proved 
without  the  strict  closure  restriction  are  discussed  in  [Basu  & 
Misra  76,  Misra  78,  Misra  79,  Basu  80].  In  [Wegbreit  77],  how¬ 
ever,  the  author  describes  a  class  of  programs  for  which  the 
problem  of  "generalizing"  a  loop  specification  in  order  to 
satisfy  the  closure  requirement  is  NP-coraplete. 

2.2.  The  Loop  Invariant  f (X0)  ■  f (X) 

An  important  implication  of  Theorem  2.1  is  that  a  loop  rtiich 
computes  a  function  must  maintain  a  particular  property  of  the 
data  state  across  iterations.  Specifically,  after  each  itera¬ 
tion,  the  function  value  of  the  current  data  state  must  be  the 
same  as  the  function  value  of  the  original  input.  In  this  sec¬ 
tion  we  discuss  and  expand  on  this  characteristic  of  a  loop  which 
is  closed  for  the  domain  of  a  function  it  computes. 

A  loop  assertion  for  the  loop  B*H  is  a  boolean-valued 
expression  which  yields  the  value  TRUE  just  prior  to  each  evalua¬ 
tion  of  the  predicate  B.  In  general,  a  loop  assertion  A  is  a 
function  of  the  current  values  of  the  program  variables  (which  we 


-10 


will  denote  by  X) ,  as  well  as  the  values  of  the  program  variables 
on  entry  to  the  loop  (denoted  by  XO).  To  enqphasize  these  depen¬ 
dencies  we  write  A(X0/X)  to  represent  the  loop  assertion  A. 


Let  D  be  a  set  of  data  states.  A  loop  invariant  for  B*H 
over  a  set  D  is  a  boolean  valued  expression  A(x6,X)  wFich  satis¬ 
fies  the  following  conditions  for  all  X0,x  6  D 

(2.4)  A(X0,X0) 

(2.5)  A(X0,X)  &  B  (X)  ->  A(X0,H(X))  &  (H  (X)  €  0). 

Thus,  if  A(X0 ,x)  is  a  loop  invariant  for  B*H  over  D,  then  A(X0,X) 
is  a  loop  assertion  under  the  assumption  the  loop  begins  execu¬ 
tion  in  a  data  state  in  0.  Furthermore,  the  validity  of  this 
fact  can  be  demonstrated  by  an  inductive  argument  based  on  the 
number  of  loop  iterations. 

Loop  assertions  are  of  interest  because  they  can  be  used  to 
establish  conditions  which  are  valid  when  (and  if)  the  execution 
of  the  loop  terminates.  Specifically,  any  assertion  which  can  be 
inferred  from 

(2.6)  A  (XO  ,X)  &  ~B  (X) 

will  be  valid  immediately  following  the  loop. 

It  should  be  clear  that  for  any  loop  B*H,  there  may  be  an 
arbitrary  number  of  valid  loop  assertions.  Indeed,  the  predicate 
TRUE  is  a  trivial  loop  assertion  for  any  WHILE  loop.  However, 
the  stronger  (more  restrictive)  the  loop  assertion,  the  more  one 
can  conclude  from  condition  (2.6).  For  a  given  state -to -state 
function  f,  we  say  that  A(X0,X)  is  an  f-adequate  loop  assertion 
iff  A ( XO , X)  is  a  loop  assertion  and  A(X07x}  can  be  used  In  ver i- 
fying  that  the  loop  computes  the  function  f.  More  precisely,  if 
f  is  a  function,  the  condition  for  a  loop  assertion  A(X0,X)  being 
an  f-adequate  loop  assertion  is 

(2.7)  XO  e  D(f)  &  A  (XO  ,X)  &  ~B  (X)  ->  X-f(XO). 

A  loop  invariant  A(X0,X)  over  some  set  containing  0(f)  for  which 
condition  (2.7)  holds  is  an  f-adequate  loop  invar iant . 


Example  2.6  -  Let  P  denote  the  program 

while  a  £  {0 ,1}  do 
~~  TF  a  >  0  then 
a  :■  a  -  2 
else  a  :•  a  +  2  fi 
53 17”  ~ 


Consider  the  following  predicates 
Al(<aO>,<a>)  <->  TRUE 
A2(<a0>,<a>)  <->  ABS  (a)  <«ABS(aO) 

A3(<a0>,<a>)  <->  ODD  (a)  •  ODD(aO) 

A4(<a0>,<a>)  <->  ODD  ( a)  ■  ODD(aO)  6  AB8  (a)  <■  ABS(aO) 
A5(<a0>,<a>)  <->  ODD  (a)  «  ODD(aO)  OR  (a«3  6  a0»2) 
where  ABS  denotes  an  absolute  value  function,  and  ODD  returns  1 
if  its  argument  is  odd  and  0  otherwise.  Bach  of  the  5  predicates 
is  a  loop  assertion.  Let  D  be  the  set  of  all  possible  data 


states  for  P  (i.e.  D  *  {<a>  |  a  is  an  integer}).  Let  f  ■ 
j  (<a>  ,<ODD(a)  > )  |  a  is  an  integer}/  and  consider  A3.  Since  a  e 
{0/1}  implies  a*ODD(a),  we  can  infer  a*ODD(aO)  from  A3(<a0>,<a>) 
&  a  6  {0,l} .  Thus  A3  is  an  f-adequate  loop  assertion.  Simi¬ 
larly/  A4  and  A5  are  f-adequate  loop  assertions/  but  neither  A1 
nor  A2  is  restrictive  enough  to  be  f-adequate.  Predicates  A3  and 
A4  are  loop  invariants  over  D;  however/  since  AS  fails  (2.5) ,  it 
is  not  a  loop  invariant  (a*3,a0*2  is  a  counter  exaaple) . 

Theorem  2.2  -  If  B*H  is  closed  for  D(f)  and  B*H  confutes  f 
then  f (XO)  *  f  (X)  is  an  f-adequate  loop  invariant  over  D(f)/  and 
furthermore/  it  is  the  weakest  such  loop  invariant  in  the  sense 
that  if  A(X0 , X)  is  any  f-adequate  loop  invariant  over  D(f)r 
A(XO/X)  ->  f(X)*f(X0)  for  all  X,X0  ©  D(f). 

Proof  -  First  we  show  that  f(X)*f(X0)  is  a  loop  invariant 
over  D(f).  Condition  (2.4)  is  f(X0)*f(X0).  From  Theorem  2.1/ 
for  all  X  e  D(f) , 

B  (X)  ->  f(X)  -  f  (H  (X)  )  . 

Thus  for  all  XrX0  ©  D(f), 

B(X)  &  f(X0)*f(X)  ->  f  (X0)»f  (X)»f  (H  (X)  )  ->  f  (XO)-f  (H(X)  )  . 

Adding  the  closure  condition  B(X)  ->  H  (X)  e  D(f)  yields  condition 

(2.5) .  Thus  f (X) *f (XO)  is  a  loop  invariant  over  D(f).  Again 

from  Theorem  2.1,  for  all  X  6  D(f)  , 

~B  (X)  ->  f(X)*X. 

Thus  for  all  XO  6  D(f), 

f(X)*f(X0)  &  ~B  (X)  ->  f(X)*f(X0)  &  f(X)-X  ->  f(X0)-X 
which  shows  f(X)*f(X0)  is  f-adequate.  Let  A(X0,X)  be  any  f- 
adequate  loop  invariant  for  B*H  over  D(  f) ,  and  let  Z0,Z  be  ele¬ 
ments  of  D(f)  such  that  A(Z0,Z).  Since  B*H  computes  f  and  Z  6 
D(  f) ,  there  exists  some  sequence  Z  [1]  ,Z  [2]  ,  ...  ,Z  [nj  (possibly 
with  n*l)  where  Z[1]*Z,  Z[n]-f(Z),  ~B(Z[n)),  and  with  B(Z[i])  & 

Z[i+1]  *  H  (Z  [ i] )  for  all  i  satisfying  1  <*  i  <  n.  By  condition 

(2.5)  we  have  A(Z0/Z[1]),  A(Z0,Z{2]),  ...  ,A(Z0,Z[N]);  thus 
A(Z0,f  (Z) )  and  ”B(f(Z)).  Since  ZO  e  D(f)  and  A(X0,X)  is  f- 
adequate , 

A (ZO, f (Z ) )  &  “B (f (Z) )  ->  f(Z0)»f(Z) 
from  condition  (2.7).  Thus  for  all  ZOrZ  ©  D(f), 

A (ZO/Z )  ->  f(Z0)»f(Z). 

Example  2.6  (continued)  -  In  this  example,  A3  is  of  the  form 
f  (X) *f  (XO) .  A3  is  clearly  weaker  than  the  other  f-adequate  loop 
invariant  A4.  It  is  worth  noting  that  A3  is  not  weaker  than  A5, 
but  A5  is  not  a  loop  invariant,  and  A3  is  not  weaker  than  A2,  but 
A2  is  not  f-adequate.  This  situation  is  illustrated  in  Figure 
2.2.  The  sets  labeled  S1-S5  are  the  sets  of  ordered  pairs 
(<aO>,<a>)  satisfying  A1-A5,  respectively,  i.e. 

SI *{ ( <a0> , <a> )  1  AI  (<a0> ,<a>) } 

for  1*1, 2,3,4  and  5.  The  diagram  is  partitioned  in  half  with  a  £ 
{ 0,1}  on  the  left  and  a  e  {0,1}  on  the  right.  Note  that  S4  (or 
the  set  corresponding  to  any  f-adequate  loop  invariant  for  that 
matter)  is  a  subset  of  S3.  Furthermore,  the  set  corresponding  to 
each  f-adequate  loop  assertion  is  identical  where  a  ©  {0,1} . 


This  region  of  the  diagram  is  precisely  the  set  f. 

Consider  the  problem  of  using  Hoare's  iteration  axiom 
(2.8)  I  6  B  (X:«H(X)}  I  ->  I  {b*H}  I  &  ~B 
to  prove  the  loop  B*H  computes  a  function  f  where  B*H  is  closed 
for  0(f).  In  our  terminology,  if  B*H  is  correct  with  respect  to 
f,  I  must  be  a  loop  invariant  over  (at  least)  the  set  0(f)  (oth¬ 
erwise  X«f(X0)  for  all  XO  0  0(f)  cannot  be  inferred).  However, 
using  a  loop  invariant  over  a  proper  superset  of  D(f)  is  in  gen¬ 
eral  unnecessary,  unless  one  is  trying  to  show  the  loop  computes 
some  proper  superset  of  f.  If  we  choose  to  use  a  loop  invariant 
I  over  exactly  D(f),  Theorem  2.2  tells  us  that  f(X)»f(X0)  is  the 


Sl«> 


Figure  2.2  The  Sets  S1-S5 
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weakest  invariant  that  will  do  the  job.  In  a  sense,  the  weaker 
an  invariant  is,  the  easier  it  is  to  verify  that  it  is  indeed  a 
loop  invariant  (i.e.  that  the  antecedent  to  (2.8)  is  true), 
because  it  says  less  (is  less  restrictive,  is  satisfied  by  more 
data  states,  etc.)  than  other  loop  invariants.  Along  these 
lines,  one  might  conclude  that  if  a  loop  is  closed  for  the  domain 
of  a  function  f.  Theorem  2.2  gives  a  formula  for  the  "easiest" 
loop  invariant  over  D(f)  that  can  be  used  to  verify  the  loop  corn- 
putes  f. 

Let  us  again  consider  loop  invariants  and  functions  as  sets 
of  ordered  pairs  of  data  states.  Let  B*H  compute  f  and  let 
A (XO ,x)  be  an  f-adequate  loop  invariant.  It  is  clear  that  in 
this  case 

{(X0,X)  |  A  ( XO  ,  X)  &  ~B  (X)  &  (XO  e  D(  f)  )  } 

is  precisely  f.  That  is,  f  must  be  the  portion  of  the  set 
represented  by  A(X0,X)  obtained  by  restricting  the  domain  to  D(f) 
and  discarding  members  whose  second  component  cause  B  to  evaluate 
to  TRUE.  Can  the  set  represented  by  A(X0,X)  be  determined  from 
f?  No,  since  in  general,  there  are  many  f-adequate  invariants 
over  D(f)  and  the  validity  of  some  will  depend  on  the  details  of 
B  and  H  (e.g.  A 4  in  Example  2.6).  However,  Theorem  2.2  gives  us 
a  technique  for  constructing  the  only  f-adequate  invariant  over 
D(f)  that  will  be  valid  for  any  B  and  H,  provided  B*H  computes  f 
and  is  closed  for  D(f).  Specifically,  this  invariant  couples  an 
element  of  D(f)  with  any  other  element  of  D(f)  which  belongs  to 
the  same  level  set  of  f.  (S  is  a  level  set  of  f  iff  there  exists 
a  Y  such  that  S»{x|  f(X)«Y}) .  Put  another  way  -  all  f-adequate 
loop  invariants  over  D(f)  describe  what  the  loop  does  (i.e.  they 
can  be  used  to  show  the  loop  computes  f)  ,  and  some  may  also  con¬ 
tain  information  about  how  the  final  result  is  achieved.  That 
is,  one  might  be  able  to  use  an  f-adequate  loop  invariant  to  make 
a  statement  about  the  intermediate  states  generated  by  the  loop 
on  some  inputs.  The  intermediate  states  "predicted"  by  the  weak¬ 
est  invariant  f(X)*f(X0)  is  the  set  of  all  intermediate  states 
that  could  possibly  be  generated  by  any  loop  B*H  that  computes 
the  function.  Thus,  the  invariant  f(X)»f(X0)  can  be  thought  of 
as  occupying  a  unique  position  in  the  spectrum  of  all  possible 
loop  invariants:  it  is  strong  enough  to  describe  the  net  effect 
of  the  loop  on  the  input  set  D(f)  and  yet  is  sufficiently  weak 
that  it  offers  no  hint  about  the  method  used  to  achieve  the 
effect. 

Example  2.7  -  Consider  the  following  program 

while  a  >  0  do 
a  :»  a  -  17 
c  :■  c  b 
od. 

This  loop  computes  the  function 

f  *  (a>»0  ->  a,b,c  :■  0,b,c+a*b). 

From  Theorem  2.2,  we  know  that 
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A(<aO,bO,cO>  ,<a,b,c>)  <->  <0,b0  ,c0+a0*b0>«<0  ,b, c+a*b> 
is  the  weakest  f -ad equate  invariant  over  D(f)»{<a,b,c>  |  a>«0} . 

Consider  the  sample  input  <4,10,7>.  Our  loop  will  produce  the 
series  of  states  <4,10,7>,  <3,10,17>,  <2,10,27>,  <1,10,37>, 

<0,10,47>.  Of  course,  our  invariant  agrees  with  these  intermedi¬ 
ate  states  (i.e.  A  (<4 ,10 ,7>  ,<4 ,10 ,7> ) ,  A(<4,10,7>  ,<3,10,17>) ,  ... 
,  A (<4 ,10 ,7>  , <0 ,10 ,47> ) ) ,  but  it  also  agrees  with  <6,10,-13>.  We 
conclude  then,  that  it  is  possible  for  some  loop  which  computes  f 
to  produce  an  intermediate  state  <6,10,-13>  while  mapping 
<4,10,7>  to  <0,10,47>.  Furthermore,  no  loop  which  computes  f 
could  produce  <6,10,-12>  as  an  intermediate  state  from  the  input 
<4,10,7>  since  the  invariant  would  be  violated. 

To  emphasize  this  point,  we  define  an  f-adequate  invariant 
A ( X0 , X)  over  D(f)  for  B*H  to  be  an  internal  invariant  if  A(X0,X) 
implies  that  B*H  will  generate  X  as  an  intermediate  state  when 
mapping  X0  to  f(X0).  Intuitively,  an  internal  invariant  captures 
what  the  loop  does  as  well  as  a  great  deal  of  how  the  loop  works. 
In  our  example,  b«b0  &  c»c0+b*  (aO-a)  &  0<»a<*a0  is  an  internal 
invariant,  but  A(<a0,b0,c0>,<a,b,c>)  as  defined  above  is  not  (the 
state  <6,10,-13>  on  input  <4,10,7>  is  a  counter  example).  It  can 
be  proven  that  if  f  is  any  nonempty  function  other  than  the  iden¬ 
tity  function,  no  loop  for  computing  f  ex  is  vs  for  which 
f(X)*f(X0)  is  an  internal  invariant[l]  .  However,  if  we  consider 
nondeterministic  loops  and  weaken  the  definition  of  an  internal 
invariant  to  one  where  A(X0,X)  implies  X  may  be  generated  by  B*H 
when  mapping  X0  to  f(X0),  such  a  loop  can  always  be  found.  This 
loop  would  nondeterministically  switch  states  so  as  to  remain  in 
the  same  level  set  of  f.  Our  example  program  could  be  modified 
in  such  a  manner  as  follows: 

while  a  >  0  do 

t  :«  "some  integer  value  greater  than  or  equal 

to  zero"; 

c  :»  c  +  b  *  (a-t) ; 
a  :■  t 
od 

and  corresponds  to  a  "blind  search"  implementation  of  the  func¬ 
tion. 

In  [Basu  &  Misra  75]  ,  the  authors  emphasize  the  difference 
between  loop  invariants  and  loop  assertions.  The  fact  that  f(X) 


[1]  Outline  of  proof:  Let  f(Y);<Y,  B*H  compute  f  and  suppcse 
the  f-adequate  invariant  f(X)-f(X0)  over  D(f)  for  B*H  is  an 
internal  invariant,  fte  must  have  B  (Y) .  By  (2.2)  f(Y)»f  (H(Y) ) . 
Consider  H(Y)  as  a  fresh  input.  Since  f(X)>f(X0)  is  an  internal 
invariant  and  f  ( Y)  «f  (H  (Y) ) ,  the  loop  must  eventually  produce  in¬ 
termediate  state  Y,  which  must  then  produce  H(Y) .  Thus  B*H  fails 
to  terminate  and  does  not  compute  f. 
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■  f(XO)  U  an  f-edequate  loop  invariant  appears  in  [Basu  a  Misra 
75,  Linger,  Kills  a  Witt  79]  .  The  independence  of  this  loop 
invariant  from  the  characteristics  of  the  loop  body  is  discussed 
in  (Basu  a  Misra  75]  . 

2.3.  Com par inq  the  Hoare  and  Mills  Loop  Proof  Rules 

An  alternative  to  using  Theorem  2.1  in  showing  a  loop  com¬ 
putes  a  function  is  to  apply  Boare's  inductive  assertion  verifi¬ 
cation  technique.  That  is,  one  could  verify  P  {b*h}  Q  where 

P  <->  x»xo  a  x  6  D(  f) ,  and 
Q  <->  X-f(XO) 

by  demonstrating  the  following  for  some  predicate  I: 

All  P  ->  I 

A2:  B  a  I  {x  :*H  (X)  }  I 

A3:  “B  a  I  ->  Q. 

Strictly  speaking,  conditions  A1  through  A3  show  partial  correct¬ 
ness;  to  show  total  correctness,  one  must  also  prove 

A4:  B*H  terminates  for  any  input  state  satisfying  P. 

We  now  wish  to  compare  these  verification  conditions  with 
the  functional  verification  conditions.  Recalling  from  Theorem 
2.1,  if  B*H  is  closed  for  D(f)  ,  the  functional  verification  rules 
are : 

FI:  X  e  D(f)  ->  B*H  terminates  for  the  input  X 
F2:  X  e  D(f)  a  B  (X)  ->  f(X)  -  f(H(X)) 

F3:  X  6  D(f)  a  ~B  (X)  ->  f(X)  «  X. 

In  the  following  discussion  we  adept  the  convention  that  if  f  is 
a  function  and  X  is  not  in  D(f),  then  f(X)*Z  is  false  for  any 
formula  Z. 

Theorem  2.3  -  Let  B*H  be  closed  for  D(f).  If  f(X)*f(X0)  is 
used  as  the  predicate  I  in  A1-A3,  then  Al  a  A2  a  A3  a  A 4  <->  FI  a 
F2  a  F3.  That  is,  the  functional  verification  conditions  F1-F3 
are  equivalent  to  the  special  case  of  the  inductive  assertion 
verification  conditions  A1-A4  which  results  from  using  f(X)*f(X0) 
as  the  predicate  I.  In  particular,  if  I  <->  f(X)»f(X0)  in  the 
inductive  assertion  rules,  then 
Al  < — >  TRUE, 

A2  <->  F2  provided  X  6  D(f)  a  B(X)  ->  X  6  0(H) , 

A3  <->  F3, 

A4  <->  FI. 

Proof  -  We  begin  by  noting  that  the  termination  conditions 
A4  and  PI  are  identical,  thus  A4  <->  Fl.  Secondly  Al  is 
X-XO  a  X  6  D(f)  ->  f(X)-f(X0) 

which  is  clearly  true  for  any  f.  Combining  with  our  first  result 
yields  Al  a  A4  <->  Fl.  Condition  A3  can  be  rewritten  as 
~B(X)  a  f(X)-f(X0)  ->  x-f(XO) 

which  is  trivially  true  for  any  X,X0  outside  D(f).  Thus  A3  may 
be  rewritten  as 

A3':  X,X0  e  D(f)  a  “B  (X)  a  f(X)-f(X0)  ->  X*f(X0). 

Note  that  A3'  ->  F3  by  considering  the  case  where  X«XQ . 
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Furthermore,  F3  ->  A3>  by  considering  the  case  where  XO  6  D(f) 

and  f(X)*f(X0).  Now  we  have  A3  <->  A3'  <->  F3  and  adding  this  to 
our  result  above  we  get  Al  &  A3  &  A4  <->  Fl  &  F3.  We  next  prove 
A2  &  A4  <->  Fl  &  F2.  This  combined  with  the  above  equivalence 
yields  the  desired  result  Al  &  A2  &  A3  &  A4  <->  Fl  &  F2  &  F3. 
Note  that  if  there  exists  an  X  6  D(f)  such  that  B(X)  but  H(X)  is 
not  defined,  then  the  loop  itself  will  be  undefined  for  X,  both 
A4  and  Fl  will  be  false  and  A2  &  A4  <->  F2  &  Fl.  We  now  consider 
the  other  case  where  for  all  X  6  D(f),  B(X)  ->  X  G  D(H).  In  this 
situation  we  will  show  A2  <->  F2;  combining  with  A4  <->  Fl  yields 
A2  &  A4  <->  F2  &  Fl.  Rule  A2  may  be  rewritten  as 
B  (X)  &  f(X)»f(X0)  { X  :*H  (X)  }  f(X)*f(X0) 
which  again  is  trivially  true  if  X  or  XO  is  outside  D(f)  ;  thus  A2 
is  equivalent  to 

X,X0  6  D(f)  &  B  (X)  &  f  { X)  =f  { XO )  { X  :  “H  ( X)  }  f(X)-f(X0). 

Since  H  is  defined  for  any  input  X  e  D(f)  such  that  B(X)  by 
hypothesis,  this  may  be  transformed  using  Hoare's  axiom  of 
assignment  to  the  implication 
A2':  X,X0  G  D(f)  &  B(X)  &  f(X)«f(X0)  ->  f  (H  (X)  )  *f  ( XO)  . 

As  before,  we  can  show  A2'->F2  by  considering  the  case  where 
X*X0,  and  F2->A2"  by  cot  sidering  the  case  where  XO  6  D(f)  and 
f(X)=f(X0).  Thus  A2  <->  A2'  <->  F2  which  implies  A2  <->  F2. 

This  completes  the  proof  of  the  theorem. 

The  purpose  of  Theorem  2.3  is  to  allow  us  to  view  the  func¬ 
tional  verification  conditions  as  verification  conditions  in  an 
inductive  assertion  proof.  Not  surprisingly,  both  techniques 
have  identical  termination  requirements.  If  the  termination  con¬ 
dition  is  met,  F2  amounts  to  a  proof  that  f(X)*f(XO)  is  a  loop 
invariant  predicate.  Condition  F3  amounts  to  an  application  of 
the  "Rule  of  Consequence,"  testing  that  the  desired  result  can  be 
implied  from  the  predicate  f(X)*f(X0)  and  the  negation  of  the 
predicate  B. 

2 .4.  Subgoal  Induction  and  Functional  Correctness 

Subgoal  induction  is  a  verification  technique  due  to  [Morris 
&  Wegbreit  77]  .  It  is  based  largely  on  work  appearing  in  [Manna  & 
Pnueli  70,  Manna  71]  .  In  this  section  we  compare  subgoal  induc¬ 
tion  to  the  functional  correctness  approach  described  above. 

We  first  note  that  subgoal  induction  can  be  viewed  as  a  gen¬ 
eralization  of  the  functional  approach  presented  here  in  that 
subgoal  induction  can  be  used  to  prove  a  program  correct  with 
respect  to  a  general  input-output  relation.  A  consequence  of 
this  generality,  however,  is  that  the  subgoal  induction  verifica¬ 
tion  conditions  are  sufficient  but  not  necessary  for  correctness; 
that  is,  in  general,  no  conclusion  can  be  drawn  if  the  subgoal 
induction  verification  conditions  are  invalid.  Provided  the  clo¬ 
sure  requirement  is  satisfied,  the  functional  verification  condi¬ 
tions  (as  well  as  the  subgoal  induction  verification  conditions 
when  applied  to  the  same  problem)  are  sufficient  and  necessary 
conditions  for  correctness.  Results  in  [Misra  77]  suggest  that 


it  is  not  possible  to  obtain  necessary  verification  conditions 
for  general  input-output  relations  without  considering  the 
details  of  the  loop  body. 

In  order  to  more  precisely  compare  the  two  techniques,  we 
consider  the  flow  chart  program  in  Figure  2.3  adapted  from 
[Morris  &  Wegbreit  77]  .  In  the  figure,  Pl,P2,P3  and  P4  are 
points  of  control  in  the  flow  chart,  X  represents  the  program 
data  state,  B  is  a  predicate  on  the  data  state  and  K,H  and  Q  are 
data -state -to -data -state  functions.  Note  that  this  flow  chart 
program  amounts  to  a  WHILE  loop  surrounded  by  pre  and  post  pro¬ 
cessing.  Our  goal  is  to  prove  the  program  computes  a  function  f. 
Morris  &  Wtegbreit  point  out  that  subgoal  induction  uses  an  induc¬ 
tion  on  the  P2  to  P4  path  of  the  computation?  that  is,  one 
selects  some  relation  v,  inductively  shows  it  holds  for  all  P2  to 
P4  execution  paths,  and  then  uses  v  to  show  f  is  computed  by  all 
Pi  to  P4  execution  paths.  In  our  application,  since  f  is  a  func¬ 
tion,  it  will  be  required  that  v  itself  be  a  function.  Once  v 
has  been  selected,  the  verification  conditions  are 
SI:  X  e  D(v)  &  ~B  (X)  ->  v(X)  *  Q(X) 

S2:  X  e  D(v)  &  B  (X)  ->  v(H(X>)  =  v(X) 

S3:  X  6  D(f)  ->  f(X)  -  v ( K ( X)  )  . 


******** 

*  Pl  * 
******** 


X  :*  K ( X) 


******** 

■>  *  P2  * 

******** 

I 

/  \ 

/  \  No  ******** 

|  B  (X)  ?  | - >*  P3  *--> 

\  /  ******** 

\  / 

I  Yes 


Q  ( X) 


******** 

— >*  P4  * 
******** 


X  :■  H  (X) 


Figure  2.3  Flow  Chart  Program 
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Note  that  SI  and  S2  test  the  validity  of  v;  S3  checks  that  v  can 
be  used  to  show  the  program  computes  f. 

The  functional  verification  theory  presented  here  is  similar 
with  the  exception  that  the  function  Q  is  not  included  in  the 
induction  path.  We  select  some  function  g  and  show  it  holds  for 
all  P2  to  P3  execution  paths  (i.e.  we  show  the  WHILE  loop  com¬ 
putes  g)  and  then  use  g  to  show  f  is  computed  by  all  PI  to  P4 
execution  paths.  Once  g  has  been  selected,  the  verification  con¬ 
ditions  are 

Pis  X  6  D(g)  &  ~B  (X)  ->  g(X)*X 

F2:  X  6  D(g)  &  B  (X)  ->  g(H(X))  =  g ( X) 

P 3:  X  e  D(f)  ->  f(X)  *  Q(g(K(X))). 

Note  that  both  techniques  require  the  invention  of  an  inter¬ 
mediate  hypothesis  which  must  be  verified  in  a  "subproof."  This 
hypothesis  is  then  used  to  show  the  program  computes  f.  The 
function  Q  in  the  flow  chart  program  is  absorbed  into  the  inter¬ 
mediate  hypothesis  in  the  subgoal  induction  case;  it  is  separate 
from  the  intermediate  hypothesis  in  the  functional  case.  Indeed, 
the  two  intermediate  hypotheses  are  related  by 
v  »  Q  o  g. 

If  Q  is  a  null  operation  (identity  function),  the  intermedi¬ 
ate  hypotheses  and  verification  conditions  of  the  two  techniques 
are  identical.  A  significant  difference  between  the  two  tech¬ 
niques,  however,  can  be  seen  by  examining  the  case  where  K  is  a 
null  operation.  If  the  loop  is  closed  for  D(f),  subgoal  induc¬ 
tion  enjoys  an  advantage  since  f  can  be  used  as  the  intermediate 
hypothesis.  That  is,  the  subgoal  induction  verification  condi¬ 
tions  are  simply 

Si" :  X  6  D(f)  &  ”B  (X)  ->  Q  ( X)  »  f  ( X) 

S 2" :  X  e  D(f)  &  B(X)  ->  f(H(X))  *  f(X). 

In  the  functional  case,  one  must  still  derive  an  hypothesis 
for  the  loop  function  g.  A  heuristic  which  can  be  applied  here 
is  to  restrict  one's  attention  to  functions  which  are  subsets  of 
q**-1  o  f.  However,  it  is  worth  emphasizing  that  this  rule  need 
not  completely  specify  g  since,  in  general,  Q**-l  of  is  not  a 
function.  Once  g  has  been  selected,  the  verification  conditions 
are 

Pi's  X  6  D(g)  &  ~B(X)  ->  g ( X) *X 

F2":  X  e  D(g)  &  B  (X)  ->  g(H(X))*g(X) 

F  3" :  X  6  D(f)  ->  f(X)  -  Q(g(X)). 

The  difference  between  the  two  techniques  in  this  case  is 
due  to  the  prime  program  decomposition  nature  of  the  functional 
correctness  algorithm  described  in  Section  2.1.  A  more  efficient 
proof  is  realized  by  treating  the  loop  and  the  function  Q  as  a 
whole.  Accordingly,  correctness  rules  for  this  program  form 
might  be  incorporated  into  the  prime  program  functional  correct¬ 
ness  method  described  earlier.  The  validity  of  these  rules  can 
be  demonstrated  in  a  .manner  quite  similar  to  the  proof  of  Theorem 
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2.1 


Example  2.8  -  We  wish  to  sho^  the  program 

while  x  £  { 0,1 ,2 ,3}  do 

if  x  <  0  then  x  :■  x  +  4 

else  x  :=»  x  -  4  fi 

Od; 

if  x  >  1  then  x  :=  x  -  2  f i 

confutes  the  function  f»{ (<x> , <0DD(x) >) } .  The  Subgoal  induction 
verification  conditions  are 

x  6  |0, 1,2, 3}  ->  Q(x)  *  ODD(x)  ,  and 
x  £  10,1,2,3}  ->  ODD(H  (x) )  »  ODD(x)  ,  where 

Q(x)  *  if  x  >  1  then  x-2  else  x,  and 
i  H(x)  *  if  x  <  0  then  x+4  else  x-4. 

Both  these  conditions  are  straightforward.  Now  let  us  consider 
the  prime  program  functional  case.  Suppose  we  are  given  (or  may 
derive)  the  intended  loop  function 

I  g  *  { (<x0> , <x> )  |  x  e  {0,1,2, 3}  &  x  mod  4  *  xO  mod  4}. 

We  can  verify  that  the  loop  computes  g  by  demonstrating  PI'  and 
F2'.  Condition  F3'  uses  g  to  complete  the  proof. 

The  difficulty  with  splitting  up  the  program  in  this  example 
is  that  it  requires  the  verifier  to  "dig  out"  unnecessary  details 
concerning  the  effect  of  the  loop.  One  need  not  determine  expli¬ 
citly  the  function  computed  by  the  loop  in  order  to  prove  the 
<*  program  correct.  The  only  important  loop  effect  (as  far  as  the 

correctness  of  the  program  is  concerned)  is  x  e  (0,1,2, 3}  and 
ODD ( x)  *  ODD ( xO ) .  In  this  example,  treating  the  program  as  a 
'  whole  appears  superior  since  it  only  tests  for  the  essential 

characteristics  of  the  program  components. 

It  is  worth  observing  that,  provided  the  loop  is  closed  for 
D(f),  an  inductive  assertion  proof  of  a  program  of  this  form 
!  could  be  accomplished  by  using  the  loop  invariant  f(X)  *  f(X0). 

The  verification  conditions  in  this  case  vrould  be  equivalent  to 
'  the  subgoal  induction  verification  conditions.  Note  that,  in 

general  (as  in  our  example),  f(X)  -  f(X0)  is  too  weak  an  invari- 
^  ant  to  be  g-adequate  for  the  intended  loop  function  g  above. 

%  i  J2.5.  initialized  Loops 

The  preceding  section  indicates  that  it  is  occasionally 
advantageous  to  consider  a  program  as  a  whole  rather  than  to  con¬ 
sider  its  prime  programs  individually.  In  this  section  we 
^  attempt  to  apply  the  same  philosophy  to  the  initialized  loop  pro- 

r  gram  form  and  use  the  result  as  a  basis  from  which  to  compare  the 

functional  and  inductive  assertion  approaches  to  this  particular 
/  verification  problem. 


'I  *0.1 


We  will  again  consider  the  program  in  Figure  2.3  with  the 
understanding  that  Q  is  a  null  operation  (identity  function),  We 
want  to  prove  that  the  program  computes  a  function  f,  i.e.  that  f 
holds  fbr  all  PI  to  P3  paths.  We  have  seen  that  prime  program 
functional  correctness  involves  an  induction  on  the  P2  to  P3  exe¬ 
cution  path  using  an  intermediate  hypothesis  g.  An  inductive 
assertion  proof  would  involve  an  induction  on  the  Pi  to  P2  execu¬ 
tion  path  using  some  limited  loop  invariant  A(X0,X)  [Linger, 
Mills  &  Witt  79]  .  A  limited  loop  invariant  differs  from  those 
discussed  previously  in  that  it  takes  into  account  the  initiali¬ 
zation  preceding  the  loop.  One  of  the  objectives  of  this  section 
is  to  discuss  the  relative  difficulties  of  synthesizing  the 
intermediate  hypotheses  g  and  A. 

Ws  now  reason  about  whether  there  might  be  an  efficient  way 
to  verify  the  program  by  treating  it  as  a  whole  (i.e.  rather  than 
treating  the  initialization  and  the  loop  individually).  In  order 
for  the  program  to  confute  f,  it  must  be  that  K(X)*R(Y)  -> 
f(X)*f(Y).  Consequently,  the  relation  represented  by  f  o  (R**-l) 
is  a  function  and  is  a  candidate  for  the  intermediate  hypothesis 
g.  Indeed,  the  initialized  loop  program  is  correct  with  respect 
to  f  iff  g  *  f  o  ( K**-l)  is  a  function  and  the  WHILE  loop  (by 
itself)  is  correct  wrt  g.  Unfortunately,  the  domain  of  this 
function  is  the  image  of  D(f)  through  R,  and  since  the  purpose  of 
the  initialization  is  often  to  provide  a  specific  "starting 
point"  for  the  loop,  the  loop  will  seldom  be  closed  for  the 
domain  of  this  function.  Thus  the  problem  of  finding  an 
appropriate  g  can  be  thought  of  as  one  of  generalizing  f  o  (R**- 
1). 

Example  2.9  -  We  want  to  show  the  program 

s  :*  0;  i  s»  0? 

while  i  <  n  do 

i  :■  i  +  1; 

s  ;»  s  +  a[  i] 

od 

computes 

f  *  s  :*SUM(  k,  1  ,n,  a[k]  )#sf  . 

As  before,  SUM( k,  1  ,n, a[ k] )  is  a  notation  for  a[l]  +  a[2]+  ... 
+  a(n]  .  If  K  represents  the  function  performed  by  the  initializa¬ 
tion,  f  o  ( R**-l)  is 

(s-0,i»0  ->  s  :«SUM(k,l,n,a[k] )  )#s#. 

Note  that  the  loop  is  not  closed  for  the  domain  of  this  function. 
To  verify  the  program  using  the  functional  method,  this  function 
must  be  generalized  to  a  function  such  as 

g  *  s  :«s+SUM(k,  i+l,n,a[k]  )fst . 

We  now  consider  the  relative  difficulties  of  synthesising  a 
suitable  loop  function  g  (for  a  functional  proof)  and  synthesiz¬ 
ing  an  adequate  limited  loop  invariant  (for  an  inductive  asser¬ 
tion  proof)  .  If  we  have  a  satisfactory  g  for  a  functional  proof 
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of  the  program,  the  analysis  in  Section  2.2  indicates  that  the 
invariant  A(X0,X)  <->  g(X)-g(X0)  over  D(g)  can  be  used  to  show 

the  loop  confutes  g;  absorbing  the  initialization  X:-K(X)  into 
the  invariant  gives  the  result  that  the  limited  invariant  A(X0,X) 
<->  g(X)*g(K(X0) )  can  be  used  to  prove  the  initialized  loop  pro¬ 
gram  confutes  g  o  K  ■  f.  We  now  try  to  go  the  other  way.  Sup¬ 
pose  we  have  an  appropriate  limited  loop  invariant  A(X0,X)  for  an 
inductive  assertion  proof  of  the  program,  can  we  derive  from  that 
an  adequate  loop  function  g?  We  motivate  the  result  as  follows: 
we  could  obtain  an  equivalent  program  by  modifying  the  initiali¬ 
zation  to  (nondeterministically)  map  XO  to  X  if  A(X0,X)  is  true. 
The  modified  program  (assuming  termination)  must  still  confute 
the  same  function;  if  the  initialization  maps  XO  to  anything 
other  than  K(X0),  the  effect  will  simply  be  to  alter  the  number 
of  iterations  executed  by  the  loop.  By  the  same  argument  that 
was  used  to  show  that  the  loop,  assuming  correctness,  must  com¬ 
pute  f  o  ( K**-l) ,  the  loop  must  also  compute  f  o  (A  (XO  ,x)  **-l) . 
That  is,  if  A(X0,X)  holds  for  some  XO  B  D(f)  and  for  some  X,  the 
loop  must  map  X  to  f(X0).  Note  that  the  loop  is  necessarily 
closed  for  the  domain  of  this  function;  otherwise  the  invariant 
would  be  violated.  The  proper  conclusion  is  that  the  synthesis 
of  an  adequate  loop  function  and  the  synthesis  of  a  suitable 
invariant  are  equivalent  problems  in  the  sense  that  a  solution  to 
one  problem  implies  a  solution  to  the  other  problem. 

Example  2.9  (continued)  -  An  inductive  assertion  proof  of 
our  program  might  use  the  limited  invariant  s»SUM(k,l,i,a[k] )  & 
0<»i<«n.  Note  that  this  invariant  implies  the  invariant 
g  (K(X0)  )»g(X)  discussed  above  (where  g  and  K  are  as  defined  pre¬ 
viously).  Using  the  technique  outlined  above,  we  may  derive  from 
this  invariant  the  loop  function 
g-  =  (s*SUM(k,l,i,a[k] ) ,  0<*i<*n  ->  s  :*SUM(  k,l  ,n,  a[k]  ) )  #s#  . 
Observe  that  this  is  quite  different  from  the  original  g,  but 
that  g'  is  quite  satisfactory  for  a  functional  proof  of  correct¬ 
ness.  It  may  seem  puzzling  that  g',(X0)*g'(X)  is  the  constant 
invariant  TRUE  over  the  set  D(g')  and  yet  Theorem  2.2  states  that 
such  an  invariant  must  be  g'-adequate.  This  is  not  a  contradic¬ 
tion,  however,  since 

TRUE,  i>»n  ->  s»SUM(k,l,n,a[k]  ) 

is  valid  for  any  state  in  Dig")  .  Similarly,  a  functional  proof 
that  the  loop  computes  g"  is  trivial  with  the  exception  of  veri¬ 
fying  that  the  closure  requirement  is  satisfied.  This  is  no 
coincidence:  proving  closure  is  equivalent  to  demonstrating  the 
validity  of  the  loop  invariant. 

The  translation  between  loop  invariants  and  intermediate 
hypotheses  in  a  subgoal  induction  proof  is  discussed  in  [Norris  & 
Wegbreit  77,  King  80]  .  In  Chapter  3,  we  propose  a  new  verifica¬ 
tion  strategy  for  initialized  loop  programs  which  is  based  on  the 
above  mentioned  notion  of  a  nondeterministic  loop  initialization. 


-22- 


2.6 .  Discussion 


Our  purpose  in  this  chap te r  has  been  to  explain  the  func¬ 
tional  verification  technique  in  light  of  other  program  correct¬ 
ness  theories.  The  functional  technique  is  based  on  Theorem  2.1 
which  provides  a  method  for  proving  /disproving  a  loop  correct 
with  respect  to  a  functional  specification  when  the  loop  is 
closed  for  the  domain  of  the  function. 

In  Theorem  2.2,  a  loop  invariant  derived  from  a  functional 
specification  is  shown  to  be  the  weakest  invariant  over  the 
domain  of  the  function  which  can  be  used  to  test  the  correctness 
of  the  loop.  Theorem  2.3  indicates  that  the  functional  correct¬ 
ness  technique  for  loops  is  actually  the  special  case  of  the 
inductive  assertion  method  that  results  from  using  this  particu¬ 
lar  loop  invariant  as  an  inductive  assertion.  The  significance 
of  this  observation  is  that  the  functional  correctness  technique 
for  loops  can  be  viewed  either  as  an  alternative  verification 
procedure  to  the  inductive  assertion  method  or  as  a  heuristic  for 
deriving  loop  invariants. 

The  subgoal  induction  technique  seems  quite  similar  to  the 
functional  method;  the  two  techniques  often  produce  identical 
verification  conditions.  We  have,  however,  observed  an  exanple 
where  the  subgoal  induction  method  appears  superior  to  functional 
correctness  based  on  prime  program  decomposition.  In  the  follow¬ 
ing  chapter,  the  idea  of  a  proof  transformation  is  introduced  and 
is  proposed  as  a  decomposition  strategy  which  overcomes  this 
deficiency  in  prime  program  deconposit ion. 

We  have  examined  the  inductive  assertion  and  functional 
methods  for  dealing  with  initialized  loops.  We  have  shown  that 
the  problems  of  finding  a  suitable  loop  invariant  and  finding  an 
adequate  loop  function  are  essentially  identical.  The  result 
indicates  that  for  this  class  of  programs,  the  two  methods  are 
theoretically  equivalent;  that  is,  there  is  no  theoretical  jus¬ 
tification  for  selecting  one  method  ever  the  other. 

In  Chapter  4,  we  deal  with  the  functional  correctness 
requirement  that  each  program  loop  be  documented  with  an 
appropriate  intended  function.  In  order  to  alleviate  this  prob¬ 
lem,  a  heuristic  technique  is  proposed  which  can  be  used  as  an 
aid  in  ascertaining  undocumented  intended  functions  for  program 
loops. 


We  explained  in  Section  2.5  that  it  is  possible  to  obtain  a 
loop  function  for  a  loop  preceded  by  initialization  based  on  the 
initialization  and  the  program  specification.  This  function, 
however,  usually  has  a  restricted  domain  and  must  then  be  gen¬ 
eralized  to  meet  the  closure  requirement  of  Theorem  2.1.  In 
Chapter  5,  we  discuss  a  class  of  loops  for  which  these  generali¬ 
zations  may  be  obtained  in  a  systematic  manner. 
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_3.  A  New  Verification  Strategy  For  Iterative  Programs 

The  difficulties  associated  with  the  general  verification  of 
computer  programs  are  well  known.  Chief  among  these  difficulties 
is  the  problem  of  creating  a  suitable  inductive  hypothesis  for 
programs  which  use  iteration  or  recursion.  A  large  number  of 
guidelines  or  heuristics  for  the  construction  of  these  hypotheses 
have  appeared  in  the  literature.  Particularly  promising  are 
results  along  the  lines  of  Theorem  2.1  [Mills  75,  Basu  &  Misra 
75,  Wegbreit  77,  Morris  &  Wegbreit  77]  which  show  that  for  a  par¬ 
ticular  class  of  program/specification  pairs,  an  appropriate 
inductive  hypothesis  may  be  obtained  directly  from  the  specifica¬ 
tion.  The  direct  consequence  of  these  results  is  that  for  a 
verification  problem  in  this  class,  the  program  can  be 
proven/disproven  correct  with  respect  to  its  specification 
(assuming  termination)  by  testing  several  verification  conditions 
based  on  the  specification  and  characteristics  of  the  program 
components.  unfortunately,  this  class  is  rather  restricted  and, 
as  a  result,  many  verification  problems  which  arise  naturally  in 
practice  are  not  covered  by  these  results. 

In  this  chapter,  a  verification  strategy  is  described  which 
is  based  on  the  idea  of  applying  a  correctness/incorrectness 
preserving  transformation  to  the  program  under  consideration. 
The  motivation  behind  the  transformation  is  to  produce  a  verifi¬ 
cation  problem  such  that,  in  a  manner  similar  to  that  in  the 
above  mentioned  work,  an  inductive  hypothesis  can  be  directly 
obtained.  Thus  we  are  proposing  replacing  the  problem  of  syn¬ 
thesizing  an  inductive  hypothesis  with  one  of  discovering  an  ade¬ 
quate  correctness/incorrectness  preserving  transformation.  In  a 
number  of  examples  we  have  studied,  the  latter  problem  appears  to 
be  more  tractable. 

In  the  remainder  of  this  section  we  discuss  a  general  verif¬ 
ication  problem  which  occurs  often  in  practice  and  then  suggest 
the  idea  of  a  transformation  as  a  means  to  solve  the  problem.  A 
number  of  heuristics  for  discovering  an  appropriate  transforma¬ 
tion  are  given  in  Section  3.1  and  are  illustrated  with  examples. 
In  Section  3.2,  the  proposed  solution  is  compared  and  contrasted 
with  the  inductive  assertion  and  subgoal  induction  approaches  to 
the  verification  problem.  Finally,  the  application  of  the  tech¬ 
nique  to  more  complex  program  forms  is  discussed  in  Section  3.3. 

In  our  analysis,  we  will  consider  a  program  P  of  the  form 

X  K(X)  ; 

while  B (X)  do 
T~ s*  H (XT” 
od; 

X  :«  Q(X) 


where  X  is  the  program  data  state,  K,  H  and  Q  represent  data- 
8 tate -to -data -st ate  functions  and  B  is  a  predicate  on  the  data 


state.  For  the  present  we  assume  each  of  the  functions  and  the 
predicate  is  explicitly  known;  this  requirement  regarding  the 
function  Q  is  relaxed  in  Section  3.3.  We  assume  that  the  program 
specification  is  formulated  as  a  function  mapping  the  input  data 
state  to  the  portion  of  the  output  data  state  which  corresponds 
to  the  program  variables  whose  final  values  are  of  interest,  we 
will  use  a  function  e  which  extracts  from  any  data  state  the  data 
state  portion  corresponding  to  these  variables.  Thus  for  this 
verification  problem,  P  is  correct  with  respect  to  its  specifica¬ 
tion  function  f  if  and  only  if  for  any  X  in  D(f),  [P]  (X)  is 
defined  and  f  (X)  »e(  [P]  (X) ) . 

Let  TAIL  be  the  portion  of  P  which  follows  the  initializa¬ 
tion  X  :«  K { X) ,  i.e.  TAIL  is  the  WHILE  loop  followed  by  the 
assignment  X  :*  Q(X).  We  begin  with  the  followings 

Observation  1  -  P  is  correct  wrt  f  iff  TAIL  is  correct  wrt 
the  input /output  relation 

g  *  { (x, y)  |  ho  0  n(f)  y  (K(xo)-x  &  f(xo)-Y)}. 

That  is,  P  is  correct  wrt  f  iff  on  data  states  X  which  are 
output  by  K  on  input  XO  e  D( f)  ,  TAIL  produces  f(X0).  For  the 
moment,  we  make  the  following  two  assunptions: 

FUNCTION:  The  relation  g  is  a  function,  i.e. 

K(X0)«K(X1)  ->  f(X0)«f(Xl) 
for  all  XO,  XI  €  0(f)  ,  and 

CLOSURE:  The  WHILE  loop  is  closed  (see  Section  2.1)  for 

the  domain  of  g,  i.e. 

K(xo)*x,  b (x)  ->  4  xi  e  D(f)  y  k(xd-h(x) 

for  all  XO  0  D(f)  [2]  . 

Given  these  two  assunptions,  it  is  easy  to  state  the  neces¬ 
sary  and  sufficient  verification  conditions  for  the  correctness 
of  TAIL  (assuming  termination)  wrt  g.  These  are  (adapted  from 
[Mills  75,  Basu  &  Misra  75,  Morris  &  Wegbreit  77]) 

X  0  D(g)  ,  B  (X)  ->  g(X)-g(H(X)) 

X  e  D(g) ,  ~B  (X)  ->  g(X)»e(Q(X)) 

and  are  called  the  iteration  and  boundary  conditions,  respec¬ 
tively,  in  [Misra  78]  .  They  are  equivalent  to 
XO  e  D(f),  K(X0)»X,  B  (X)  ->  g(X)«g(H(X)) 

XO  S  D(f),  K(XO)«X,  ”B  (X)  ->  g(X)«e(Q(X)) 
i.  e. 

XO  0  D(f),  B  ( K  ( XO ) )  ->  g(K(XO))-g(H(K(XO))) 

XO  e  D(f),  ~B  ( K ( XO ) )  ->  g(K(XO)  ) * e ( Q ( K ( XO )  ) ) 
i.  e. 


[2]  Throughout  this  dissertation,  we  frequently  use  the  nota¬ 
tion 

PI,  P2,  . . . ,  PN  ->  B 
for  the  slightly  longer 

P1&P2&...  &  PN  ->  B. 
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X0,X1  e  D(f),  B  (K(XO) )  ,  K(Xl)*H  (K(XO)  )  ->  g(K(XO) )-g(K(Xl) ) 

XO  e  D(f)  ,~B(K(XO)>  ->  g(K(XO) )»e(Q(K(XO) ) ) 

i.  e. 

ITERATION: 

XOfXl  e  D(  £)  ,  B  (K(  XO )  )  ,  K  ( Xl )  *H  ( K  ( XO )  )  ->  f(XO)»f(Xl) 

BOUNDARY : 

XO  6  D(f)  ,  ~B  (K(X0) )  ->  f (XO)-e(Q(K(XO))). 

We  conclude  as  follows: 

Observation  2  -  Suppose  FUNCTION  and  CLOSURE  hold.  P  is 
correct  wrt  f  (assuming  termination)  iff  ITERATION  and  BOUNDARY 
hold. 

Example  3.1  -  Consider  the  following  program 

{a>*0} 
a  :=  a/2; 

while  a  jz  {0,1}  do 
a  :»  a  -  2 
od ; 

a  :■  (if  a»0  then  1  else  0) 

{a*EVEN(aO/2)n 

In  this  program,  the  data  state  consists  of  the  single  integer 
variable  a  and  will  be  represented  by  <a> .  The  function  EVEN  (a) 
appearing  in  the  program  postcondition  returns  1  if  its  argument 
is  even  and  0  otherwise.  The  specification  function  f  for  this 
program  is 

f(<a0>)»a  <->  a0>«0  &  a»EVEN(aO/2) 
which  implies 

<a0>  6  D(  f)  <->  a0>*0. 

This  program  relates  to  the  general  program  form  above  as  follows 
K(<a0>)»<a>  <->  a*a0/2 
B ( <a> )  <->  a  £  {0,1} 

H(<a0>)*<a>  <->  a-aO-2 

Q(<a0>)»<a>  <->  (a0*0  &  a»l)  OR  (aO^O  &  a«0) 
e(<a>)»a. 

Note  that  assulqptions  FUNCTION  and  CLOSURE  hold.  If  the  program 
terminates  for  all  inputs  in  D(f),  it  is  correct  wrt  its  specifi¬ 
cation  iff  conditions  ITERATION  and  BOUNDARY  hold.  These  can  be 
wr  itten 

a0>«0,  al>«0,  aO/2  £  {0,l},  al/2-a0/2  -  2 

->  EVEN  (aO/2)»EVEN  (al/2) 

and 

a0>«0,  aO/2  e  {0,1}  ->  EVEN  (aO/ 2)- (if  a0/2-0  then  1  else  0) 
respectively. 

At  this  point  we  stop  to  consider  the  assumptions  we  have 
made.  Suppose  assumption  FUNCTION  is  false,  but  CLOSURE  holds. 
Thus  we  have  XO,  Xl  €  D(f)  satisfying 
K(X0)«K(X1)  &  f(X0)*f (XI). 

Now  if  B ( K ( XO ) ) ,  then  K(X2)«H  (K(X0) )  for  some  X2  6  D(f)  by  CLO¬ 
SURE.  If  ITERATION  was  valid,  we  would  have  f(X0)«f(X2)  as  well 
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as  f(Xl)«f(X2).  Since  this  contradicts  our  hypothesis,  ITERATION 
does  not  hold.  On  the  other  hand,  suppose  ~B(K(X0)).  If  BOUN¬ 
DARY  held,  we  would  have  f  (XO)-e(Q(K(XO) ) )  as  well  as 
f (Xl)»e(Q(K(Xl) ) ) .  The  contradiction  leads  us  to  conclude  that 
if  FUNCTION  is  false,  but  CLOSURE  holds,  (at  least)  one  of  ITERA¬ 
TION  and  BOUNDARY  is  false.  Since  FUNCTION  being  false  implies  P 
is  not  correct  wrt  f  (i.e.  [P]  (XO)*  [P]  (Xl) ,  but  f(X0)^f  (XI) ) , 

this  leads  us  to  the  following: 

Observation  3  -  Suppose  CLOSURE  holds*  P  is  correct  wrt  f 
(assuming  termination)  iff  ITERATION  and  BOUNDARY  hold. 

Unfortunately,  it  is  much  more  difficult  to  deal  with  the 
reliance  on  assumption  CLOSURE.  We  can,  however,  gain  insight 
into  the  problem  by  studying  CLOSURE  and  its  relation  to  condi¬ 
tion  ITERATION.  Imagine  P  executing  on  some  input  in  D(f)  and 
consider  the  sequence  of  intermediate  data  states  on  which  the 
predicate  B  is  evaluated.  CLOSURE  requires  that  each  of  these 
intermediate  states  be  "reachable"  through  the  function  K  for 
some  input  element  of  D(f).  It  is  this  reachability  that  enables 
condition  ITERATION  to  test  whether  this  sequence  of  states  stays 
on  the  right  "track,"  or  more  specifically,  that  the  inverse 
images  of  these  states  through  K  remain  in  the  same  level  set  of 
f. 

Most  often  in  practice,  however,  the  purpose  of  the  initial¬ 
ization  K  is  to  "constrain"  the  data  state,  thereby  providing 
some  specific  "starting  point"  for  the  loop  execution.  In  this 
case,  the  intermediate  data  states  will  not  be  reachable  through 
K  and,  as  a  result,  condition  ITERATION  may  well  hold  for  the 
simple  reason  that  it  is  vacuously  true,  i.e.  the  term 
K(X1)»H  (K(X0) )  will  be  false  for  all  XI.  Rather  than  abandoning 
this  approach  to  the  verification  of  P,  however,  the  solution 
suggested  here  is  to  "correct"  the  problem  and  proceed. 

Consider  substituting  for  K  a  suitable  replacement  initiali¬ 
zation  K".  By  "suitable"  here,  we  mean  that  the  output  of  K' 
must  be  sufficiently  unconstrained  so  that  CLOSURE  holds  for  K', 
and  furthermore,  that  the  substitution  preserves  the  termination 
and  correctness/incorrectness  properties  of  the  program  P.  The 
following  definition  formalizes  this  idea. 

Definition  J,1  -  Let  P'  be  the  program  P  with  the  initiali¬ 
zation  K  replaced  by  K".  K"  is  a  reduction  hypothesis  iff  CLO¬ 
SURE  holds  for  K'  and  each  of 

TERMINATES:  For  inputs  in  D(f),  P  terminates  ->  P'  terminates, 
and 

PRESERVES:  P  is  correct  wrt  f  iff  P'  is  correct  wrt  f 
is  satisfied. 

The  significance  of  the  definition  is  that  once  a  reduction 
hypothesis  K'  has  been  located,  we  can  prove/disprove  the 
correctness  of  the  original  program  by  proving  its  termination 


and  verifying  ITERATION  and  BOUtDARY  with  K*  substituted  foe  K 


The  proposed  solution  of  finding  a  reduction  hypothesis  is 
analogous  to  finding  an  adequate  loop  invariant  for  an  inductive 
assertion  proof  [Hoare  69]  ,  or  finding  an  appropriate  loop  func¬ 
tion  for  a  functional  (Mills  75]  or  subgoal  induction  (Morris  & 
Wegbreit  77]  proof.  We  justify  proposing  an  alternative  to  these 
standard  techniques  for  the  following  masons: 

a)  a  reduction  hypothesis  has  a  uni^  intuition  behind 

it;  in  a  number  of  cases/  this  leads  rather  natu¬ 
rally  to  a  solution, 

b)  the  class  of  reduction  hypotheses  bears  an  interesting 

relationship  to  the  class  of  adequate  loop  invariants 
for  P, 

c)  unlike  the  technique  of  inductive  assertions,  it  is 

possible  to  disprove  an  incorrect  program  without 
considering  the  program  beyond  the  loop,  i.e.  by  dis- 
p  rov ing  ITERATION , 

d)  in  the  case  there  is  no  initialization  (i.e.  K  is  the 

identity  function)  and  the  loop  is  closed  for  D(f) 

(or,  more  generally,  any  time  CLOSURE  holds  for  K) ,  a 
very  efficient  proof  results  since  K  itself  is  a  re¬ 
duction  hypothesis,  and 

e)  the  reduction  hypothesis  solution  provides  continuity 

between  the  cases  where  CLOSURE  does  and  does  not 
hold;  an  understanding  of  "why"  CLOSURE  does  not 
hold,  for  example,  can  provide  insight  into  how  to 
create  K*  from  K. 

3.1.  Constructing  a  Reduction  Hypothesis 

In  this  section  we  will  consider  several  heuristics  for 
creating  a  reduction  hypothesis  for  P  and  will  illustrate  their 
use  on  example  programs.  We  begin  with  some  preliminary  remarks. 

In  the  discussion  in  the  preceding  section,  we  assumed  that 
the  program  pieces  corresponding  to  K,  H  and  Q  were  determinis¬ 
tic,  that  is,  we  assumed  their  semantics  (i.e.  their  input/output 
behavior)  could  be  represented  by  data-state -to -data -state  func¬ 
tions.  The  above  results,  however,  extend  in  the  natural  way  to 
the  case  where  subprograms  in  P  are  nondeterministic  provided  one 
switches  to  a  slightly  more  awkward  relational  notation  for 
representing  these  subprograms.  In  particular,  we  will  be 
interested  in  the  case  where  the  initialization  is  nondefcerminis- 
tic  since  it  turns  out  that  this  kind  of  initialization  is  often 
a  reasonable  choice  for  a  reduction  hypothesis.  If  this  new  ini¬ 
tialization  is  represented  by  a  data  -state -to -data  -state  relation 
K',  ITERATION  and  BOUNDARY  translate  to 
ITERATION* • 

X0,X1  6  D(f),  K  *  { XO  ,  X)  ,  B  (X)  ,  K*  ( XI  ,H  (X) )  ->  f(X0)-f(Xl) 
BOUNDARY*! 

XO  6  D(  f)  ,  K '  ( XO  ,  X)  ,  ~B(X)  ->  f  (XO)-e(Q(X) )  , 

where  K*(X0,X)  is  a  notation  we  will  use  which  stands  fbr  (X0,X) 
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e  k". 


Now  suppose  this  relation  K"  satisfies  CLOSURE,  i.e. 

K  "  ( XO  ,  X)  &  B  (X)  ->  4  XI  6  D(f)  }-  K"(Xl,H(X)) 
for  all  XO  0  D(f).  The  program  P"  derived  from  P  by  replacing 
the  initialization  K  with  K",  i.e.  by  replacing 
X  :*  K(X) 

with 


X  :*  "any  Y  satisfying  K"(X,  Y)", 

is  correct  wrt  f  (assuming  termination)  iff  ITERATION"  and  BOUN¬ 
DARY"  hold.  If  K"  has  been  chosen  so  as  to  be  a  reduction 


hypothesis,  the  original  P  is  correct  wrt  f  iff  P  terminates  for 
inputs  in  D(f)  and  ITERATION"  and  BOUNDARY"  hold. 


As  an  aid  in  expressing  nondeterministic  program  segments, 
we  will  use  a  notation  defined  in  [Dijkstra  7  61  •  Specifically, 
an  execution  of  the  program 
if  B1  (X)  ->  PI 
T  B2(X)  ->  P2 


|  Bn(X)  ->  Pn 

fi 

calls“for  the  execution  of  any  single  program  Pi  provided  guard 
Bi(X)  holds. 

we  now  consider  two  opposing  alternatives  to  constructing 
K".  Each  is  based  on  a  different  philosophy  for  insuring  that 
PRESERVES  is  satisfied.  The  first  is  a  progr am-or iented  approach 
and  is  characterized  by  selecting  K"  so  that  the  programs  P  and 
P"  are  equivalent,  i.e.  so  that  P  and  P"  exhibit  identical 
input/output  behavior.  Such  a  K"  trivially  guarantees  PRESERVES 
is  satisfied.  The  other  approach  is  a  specification-oriented 
approach  and  is  characterized  by  selecting  K-*  as  a  superset  of  K 
in  such  a  way  that  executions  of  P"  which  use  the  extended  aspect 
of  the  initialization  are  guaranteed  to  be  correct,  i.e.  execute 
in  accordance  with  the  program  specification.  Such  a  K"  satisfies 
PRESERVES  since  the  correctness  of  P"  implies  the  correctness  of 
P  (since  K"  is  a  superset  of  K)  and  the  correctness  of  P  implies 
the  correctness  of  P"  (the  additional  execution  paths  in  P"  are 
known  to  be  correct)  .  Thus  each  approach  chooses  a  different 
technique  aimed  at  meeting  PRESERVES.  If,  in  addition,  CLOSURE 
holds  for  this  new  initialization  and  TERMINATES  is  satisfied, 
then  K"  is  a  reduction  hypothesis. 

We  will  begin  by  considering  several  program-oriented 
heuristics  and  will  make  use  of  the  following  definition. 

Definition  3.2  -  Let  G  be  a  data-state-to-data-state  rela¬ 
tion  and  let  P*  Be  the  program  P  with  initialization  K  replaced 
by  G.  G  is  an  alternative  initialization  to  K  iff  P  and  P"  have 
identical  input/output  behavior  ter  inputs  in  both  D(f)  and  D(G) . 
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Thus  i£  6  is  an  alternative  initialization  to  K,  then,  for  a 
restricted  set  of  inputs,  G  can  be  used  in  place  of  K  without 
affecting  the  externally  observable  behavior  of  P.  In  what  fol¬ 
lows,  we  will  make  use  of  the  following  properties  which  are 
derived  trivially  from  this  definition: 

-  X  is  an  alternative  initialization  to  itself, 

-  the  union  of  any  number  of  alternative  initializations 

is  an  alterative  initialization,  and 

-  any  subset  of  an  alternative  initialization  is  an 

alternative  initialization. 

Our  interest  in  alternative  initializations  is  due  to  the  follow- 
ing  theorem. 

Theorem  J.l  -  Any  alternative  initialization  G  whose  domain 
includes  D(  f)  and  which  satisfies  CLOSURE  is  a  reduction 
hypothesis. 

Proof  -  Let  G  meet  the  conditions  stated  in  the  theorem  and 
let  P'  stand  for  the  program  P  with  initialization  G  substituted 
for  K.  From  the  definition  of  an  alternative  initialization,  P 
and  P'  have  identical  input/output  behavior  over  D(f).  Thus  TER¬ 
MINATES  and  PRESERVES  are  trivially  satisfied  and  G  is  a  reduc¬ 
tion  hypothesis. 

The  implication  of  Theorem  3.1  is  that  a  reduction 
hypothesis  may  be  created  by  discovering  a  suitable  alternative 
initialization.  This  is  the  basis  for  all  program-oriented 
approaches  to  synthesizing  a  reduction  hypothesis.  The  following 
theorem  suggests  three  techniques  for  constructing  alternative 
initializations. 

Theorem  3.2  (Program-Oriented  Heuristics)  -  Let  Kl,  K2  and 
K3  be  any  functions  satisfying 

LONGCUT:  K1(X)-Y  ->  B  (Y)  &  H(Y)»K(X) 

SHORTCUT:  K2(X)*Y  ->  B(K(X))  &  Y»H(K(X)) 

NOCUT:  K3(X)-Y  ->  B(K(X))  &  B  (Y)  &  H(Y)«H(K(X)) 
for  all  X  6  D(f).  Then  each  of  Kl,  K2  and  K3  is  an  alternative 
initialization. 

Proof  -  We  will  make  repeated  use  of  the  WHILE  loop  property 
B  (X)  ->  [TAIL]  (X)«  [TAIL]  (H  (X)  )  . 

Let  X  6  0(f)  and  X  e  D(K1).  Then 

[TAIL]  (Kl(X)  )»  [TAIL]  (H  (Kl  (X)  )  )  »  [TAIL]  (K(X))«[P]  (X)  , 
hence  Kl  is  an  alternative  initialization.  Let  X  @  D(f)  and  X  6 
D(K2).  Then 

[TAIL]  (K2  (X)  )  -  [TAIL]  (H  (K(X)  )  )  ■  [TAIL]  (K(X))-[P]  (X)  , 
hence  K2  is  an  alternative  initialization.  Finally,  let  X  6  D(f) 
and  X  e  D(K3).  Then 

[TAIL]  (K3  (X) ) *  [TAIL]  (H  (K3  (X)  ) )- [TAIL]  (H(K(X))) 

-[TAIL]  ( K  ( X) )  -  [P  ]  (X), 

hence  K3  is  also  an  alternative  initialization. 
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The  labels  on  the  conditions  in  Theorem  3.2  are  motivated  by 
the  effect  replacing  K  with  the  alternative  initializations  has 
on  the  number  of  iterations  of  the  WHILE  loop.  Kl  causes  the 
loop  to  execute  1  additional  iteration,  K2  saves  the  loop  an 
iteration,  and  K3  has  no  effect  on  the  number  of  iterations.  The 
significance  of  the  theorem  is  that  any  combination  of  K,  Kl,  K2 
and  K3  whose  domain  includes  D(  f)  and  which  satisfies  CLOSURE  is 
a  reduction  hypothesis. 

We  now  define  a  spec  if  ication -oriented  heuristic  for  creat¬ 
ing  a  reduction  hypothesis. 

Theorem  3.3  (Spec ification-Or iented  Heuristic)  -  Let  G  be 
any  alternative  initialization  to  K  whose  domain  includes  D(f) 
and  let  K4  be  any  function  satisfying 

VERYSHORTCUT:  K4  (X)  »Y  ->  ~B  (Y)  &  e(Q(  Y)  )=f  (X)  . 

Let  K"  be  the  union  of  G  and  K4  and  suppose  K"  satisfies  CLOSURE. 
Then  K"  is  a  reduction  hypothesis. 

Proof  -  Let  K"  be  as  stated  in  the  theorem  and  let  P"  stand 
for  the  program  P  with  initialization  K>  substituted  for  K.  Let 
X  €  D(  f)  and  suppose  P  terminates  for  input  X.  If  P'  executes  on 
X  and  the  aspect  of  the  initialization  K'  from  G  is  used,  P“  must 
also  terminate  since  G  is  an  alternative  initialization  to  K;  if 
the  aspect  of  the  initialization  K'*  from  K4  is  used,  P"  must  ter¬ 
minate  since  the  output  of  K4  causes  B  to  evaluate  to  false. 
Hence  in  any  case,  P'  terminates  on  the  input  X  and  THlMINATES 
holds.  The  remainder  of  the  proof  consists  of  showing  that 
PRESERVES  is  satisfied.  Suppose  P  is  correct  wrt  f.  Let  X  6 
D(f).  If  P'  executes  on  X  and  the  aspect  of  the  initialization 
K"  from  G  is  used,  then  P ”  is  also  correct  on  the  input  X  since  G 
is  an  alternative  initialization  to  K;  if  the  aspect  of  the  ini¬ 
tialization  K'  from  K4  is  used,  then  P'  must  produce 
e(  [TAIL]  (K4  (X)  )  )  -e(  Q(  K4  (X)  )  )  -f  (  X)  , 
hence  P~  is  again  correct  on  the  input  X.  Thus  in  any  case,  the 
program  P'  is  correct  wrt  f.  To  show  the  converse,  suppose  P'  is 
correct  wrt  f.  We  must  show  that  this  implies  P  is  correct  wrt 
f.  Let  X  0  D(f).  Then  X  6  D(G)  .  By  the  correctness  of  P'  and 
the  fact  that  K"  includes  G,  the  program  P"  derived  from  P  by 
replacing  the  initialization  K  with  G  produces  f(X)  for  the  input 
X.  Since  G  is  an  alternative  initialization  to  K,  P  and  P" 
exhibit  identical  input/output  behavior  fOr  X,  hence  P  produces 
f ( X)  for  the  input  X  and  thus  P  is  correct  wrt  f.  This  conpletes 
the  proof  of  the  theorem. 

As  in  Theorem  3.2,  the  label  on  the  condition  in  Theorem  3.3 
is  used  to  suggest  the  effect  the  initialization  K4  has  on  the 
execution  of  the  program.  The  output  of  a  function  satisfying 
VERYSHORTCUT  causes  the  predicate  3  to  evaluate  to  false,  and 
consequently,  the  WHILE  loop  in  P  will  execute  zero  times.  we 
repeat  that  the  functions  Kl,  K2  and  K3  represent  program- 
oriented  heuristics  since  they  are  designed  specifically  to 
preserve  the  effect  of  the  program  P  (possibly  over  a  restricted 
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domain).  Function  K4,  on  the  other  hand,  represents  a 
spec ification-or iented  heuristic  since  its  purpose  is  to  insure 
program  behavior  which  is  in  agreement  with  the  program  specifi¬ 
cation.  Mow  that  these  heuristics  have  been  defined,  we  will 
illustrate  their  use  with  a  number  of  examples. 

Example  3 .2  -  This  ex  amp  le  illustrates  a  circumstance  in 
which  assumption  CLOSURE  "almost"  holds  but  "not  quite."  In  this 
situation  it  may  be  clear  how  to  "expand"  the  initialization  in 
order  to  satisfy  CLOSURE  in  such  a  way  that  preserves  the  effect 
of  the  program.  The  program 

{ a>*0} 
a  :*  a*2; 
while  a>  0  do 
a  :  =  a  -  1 ; 
b  :■  b  +  1 
od; 
a  :  =  b 

{a»a0*2  +  bO} 

does  not  satisfy  CLOSURE  since  only  even  values  of  a  are  output 
from  the  initialization  K  and  the  WHILE  loop  is  not  closed  for 
this  set.  This  problem  could  be  "fixed"  by  supplementing  K  with 
another  initialization,  selected  nondeterminist ically,  which  pro¬ 
duced  odd  values  of  a.  The  combined  effect  would  be  an  initiali¬ 
zation  tfiich  was  capable  of  producing  all  (nonegative)  values  of 
a  and  would  consequently  satisfy  CLOSURE.  Based  on  this  reason¬ 
ing,  we  could  supplement  K  with  the  initialization 
a  a*2  +  1, 

but  an  inspection  of  the  loop-body  text  indicates  that  we  must 
compensate  in  this  case  by  subtracting  1  from  b  if  the  effect  of 
the  program  is  to  be  preserved.  This  is  actually  an  application 
of  heuristic  LONGCUT  of  Theorem  3.2;  specifically,  the  initiali¬ 
zation 

Kl  (<a0 ,bO>)=<a, b>  <->  a«a0*2+l  &  b*bO-l. 
satisfies  LOMGCUT  and  is  thus  (by  Theorem  3.2)  an  alternative 
initialization.  In  the  above  notation  for  nondeterminacy,  the 
combined  initialization  may  be  written 
if  TRUE  ->  a  :»  a*2 
"T  TRUE  ->  a  :*  a*2  +  1;  b  b-1 
fi. 

and  the  relation  K'  which  represents  this  initialization  is 
K'(<aO,bO>  ,<a,b>)  <->  ((a»a0*2  &  b»bO)  OR  (a«a0*2+l  &  b»bO-l)). 
Since  K'  is  the  union  of  two  alternative“Tnit ializations  (K  and 
Kl),  K'  itself  is  an  alternative  initialization,  and  thus,  apply¬ 
ing  Theorem  3.1,  it  is  a  reduction  hypothesis.  Hence  the  program 
can  be  verified  correct  with  respect  to  its  specification  (assum¬ 
ing  terminate  5n)  by  showing  that  ITERATION'  and  BOUNDARY'  hold. 
Condition  ITERATION'  has  the  following  two  aspects,  which 
correspond  to  the  cases  where  a  is  even  and  odd,  respectively: 
a0>«0,  al>«0,  a«a0*2,  b»bO,  a>0,  al*2+l«a-l,  bl-l*b+l 

->  aO*2+bO-al*2+bl 
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and 

a0>*0,  al>»0,  a«a0*2+l,  b« bO-1 ,  a>0,  al*2*a-l,  bl»b+l 

->  a0*2+b0-al*2+bl. 

These  can  be  simplified  (by  eliminating  aO,  bO,  al  and  bl) 
to 

a>0  ->  a+b»  (a-2)+  (b+2) 

and 

a>0  ->  (a-l)+  (btl)»  (a-l)+  (btl) , 
both  of  which  hold.  Condition  BOUNDARY'  is 
a0>*0,  a»a0*2,  b-bO ,  a<*0  ->  aO*2+bO»b 
which  simplifies  to 
a*0  ->  a+b*b, 
and  thus  also  holds. 

Example  -  We  noted  above  that  the  purpose  of  loop  ini¬ 
tialization  is-  often  to  set  the  data  state  to  some  spec  if  ic 
"starting  point"  for  the  execution  of  the  loop.  Once  the  loop 
begins  execution,  the  data  state  leaves  this  starting  point  and 
takes  on  more  general  values.  With  this  in  mind,  we  note  that  the 
program  P  is  equivalent  to  the  program 

X  :»  H  ( K ( X)  )  ; 

while  B (X)  do 
X  :*  H  (X) 
od; 

X  :»  Q ( X) 

on  executions  of  P  which  require  at  least  one  loop  iteration. 
Since  the  initialization  in  this  new  program  includes  an  execu¬ 
tion  of  the  loop  body,  the  hope  is  that  the  output  of  the  new 
initialization  will  be  general  enough  to  satisfy  CLOSURE.  This 
observation  is  the  motivation  behind  heuristic  SHORTCUT  in 
Theorem  3.2  and  can  be  applied  as  follows:  convert  (if  necessary) 
the  verification  problem  to  one  over  a  domain  where  the  loop  will 
execute  at  least  once  and  try  the  initialization  H  o  K  for  the 
reduction  hypothesis  K'.  As  an  illustration,  consider  the  pro¬ 
gram 


pa  :»  0; 

while  s  f  NULL  do 

pa  :*  pa  +  head(s)  ? 

s  :*  tail(s) 

od; 

if  pa  <*  0  then  pa  :■  0  else  pa  :■  1  fi 
Tpa»PAVG(sO)l 

which  determines  whether  the  arithmetic  average  of  the  integers 
appearing  in  the  sequence  s  is  positive  or  negative.  The  func¬ 
tion  head(s)  for  a  nonempty  sequence  s  returns  the  lead  element 
in  8,  tail(s)  returns  s  with  head(s)  removed,  and  NULL  denotes 
the  sequence  containing  0  elements.  The  function  PAVG(sO) 
appearing  in  the  postcondition  has  the  value  1  if  the  average  of 
the  elements  of  sO  is  positive  and  0  otherwise.  CLOSURE  is  not 
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satisfied  for  this  program  since  pa  takes  on  values  other  than  0 
as  the  loop  iterates.  Suppose  we  can  convince  ourselves  that  the 
program  executes  correctly  when  the  input  s  is  the  enpty  sequence 
and  the  loop  is  bypassed.  The  remainder  of  the  proof  then  con¬ 
sists  of  verifying  the  program  assuming  the  precondition 
{s/NULL}.  Accordingly,  we  will  now  use  the  program  specification 
function 

f ( <paO ,sO>)*pa  <->  sO^NULL  &  pa-PAVG(sO). 

The  heuristic  suggested  above  is  to  try  the  initialization 
K2  ( <paO  ,s0>  ,  <pa,  s>)  <->  sO^NULL  &  pa-head(sO)  &  s»tail(sO) 
as  a  reduction  hypothesis.  This  function  satisfies  SHORTCUT  and 
is  thus  an  alternative  initialization.  Its  domain  includes  D(f) 
(i.e.  {<pa,  s>  |  s/NULL})  and  since  it  satisfies  CLOSURE,  it  is  a 
reduction  hypothesis.  To  apply  this  result  in  a  demonstration  of 
the  correctness  of  the  program,  we  must  show  ITERATION''  and  BOUN¬ 
DARY  '  hold  (using  K2  for  K')  .  Condition  ITERATION'  is 
sO^NULL,  sl^NULL,  pa»head(sO),  s-tail(sO),  s/NULL, 
pa+head ( s) “head ( si) ,  ta  il(  s)  =ta  il(  si)  ->  PA  VG  ( sO )  «PAVG  ( si ) 
which  simplifies  to 

s^NULL  ->  PAVG(<pa>|  |s)=PAVG(<pa+head(s)>  |  jtail(s)). 
where  <x>  is  the  sequence  containing  the  single  element  x  and  |  | 
denotes  concatenation  of  sequences.  Condition  BOUNDARY'  is 
sO^NULL,  pa-head (sO),  s*tail(sO),  s»NULL 

->  PAVG  (  sO) “POSITIVE  (pa) 

where  POSITIVE  (pa)  is  a  function  which  has  the  value  1  if  pa  is 
positive  and  0  otherwise.  This  condition  simplifies  to 
sO=*<pa>  ->  PAVG  ( sO  )* POSITIVE  (pa)  . 

Example  _3.4  -  We  now  illustrate  an  application  of  the 
specification-oriented  heuristic  VERYSHORTCUT  of  Theorem  3.3. 
Since  any  output  Y  of  a  function  K4  satisfying  VERYSHORTCUT  must 
satisfy  “B  (Y) ,  this  approach  is  most  appropriately  used  when  CLO¬ 
SURE  holds  for  all  loop  iterations  except  the  last,  i.e.  the  only 
lack  of  values  in  the  range  of  K  is  where  B  is  FALSE.  The  idea 
is  to  add  to  the  initialization  the  production  of  these  values  in 
such  a  way  that  assures  the  correct  execution  of  the  program. 
The  following  program  serves  to  illustrate  this  kind  of  tech¬ 
nique.  It  searches  a  linked  list  fOr  an  element  containing  a  key 
field  with  the  value  17: 

found  :■  FALSE; 

while  p  j  NIL  &  “found  do 
IT  p~  .  key*  17  then 
found  :*  TRUE 
else  p  p^.link  fi 
ocf 

{ f  ou  nd- 1 NC  HAI N  ( pO  ,  1 7 ) } 

The  program  notations  p~.key  and  p'.link  are  the  key  and  link 
fields,  respectively,  of  the  node  pointed  to  by  p.  A  link  field 
of  NIL  is  used  to  mark  the  end  of  the  list.  The  function 
INCHAIN  (pO  ,17)  appearing  in  the  postcondition  is  a  predicate 
which  holds  iff  the  chain  pointed  to  by  pO  contains  a  node  with  a 
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key  field  of  17.  Note  that  since  there  is  no  program  text  follow¬ 
ing  the  loop,  Q  is  the  identity  function.  CLOSURE  is  not  satis¬ 
fied  for  this  program  since  the  variable  found  may  take  on  the 
value  TRUE  on  the  last  loop  iteration.  The  initialization  is 
extended  by  alternatively  assigning  the  value  TRUE  to  found,  but 
only  in  the  case  where  INCHAIN (p,  17)  holds.  Specifically,  we 
propose  to  supplement  the  given  initialization  with  the  function 
K4  ( <foundO ,p0> )*< found,  p>  <->  INCHAIN  (pO  ,17)  6  p-pO  &  found 
to  yield  the  new  initialization 
if  TRUE  ->  found  :-  FALSE 
T  INCHAIN (p, 17)  ->  found  :*  TRUE 
fi. 

Note- that  R4  satisfies  VERY SHO RTC UT ,  i.e.  on  executions  of  the 
program  which  follow  the  second  path,  the  loop  body  is  bypassed 
and  the  program  necessarily  behaves  in  accordance  with  its 
specification.  The  combined  initialization  is  represented  by  the 
input/output  relation 

K”(<foundO,pO>,<found,p>)  <-> 

(“found  OR  found* INCHAIN  (pO ,  17) )  &  p-pO  . 

Since  arbitrary  values  of  found  and  p  may  emerge  from  this  ini¬ 
tialization,  CLOSURE  is  satisfied.  By  Theorem  3.3  (using  the 
original  initialization  K  for  G)  ,  K'  is  a  reduction  hypothesis. 
We  now  use  K”  to  state  the  necessary  and  sufficient  conditions 
(assuming  termination)  for  the  correctness  of  the  program. 
Corresponding  to  the  two  paths  through  the  loop  body  of  this  pro¬ 
gram,  there  are  two  ITERATION'*  conditions 

("found  OR  found-INCHAIN  (pO, 17) ) ,  p-pO,  p^NIL,  “found, 
pA. key-17,  (“TRUE  OR  TRUE-INCHAIN(pl ,17) ) ,  p-pl 

~  ->  INCHAIN (pO ,17) “INCHAIN (pi ,17) 

and 

(“found  OR  found* INCHAIN (pO ,17) ) ,  p-pO,  P^NIL,  “found, 

pA.key?17,  ("FALSE  OR  FALSE-INCHAIN (pi ,17) ) ,  p“.link-pl 

->  INCHAIN (pO ,17) "INCHAIN (pi, 17) , 

which  simplify  to 

“found,  p^NIL,  p~ . key- 1 7 ,  INCHAIN(p,17) 

->  INCHAIN(p,17)»INCHAIN(p,17) 

and 

“found,  p^NIL,  pA.key^l7 

->  INCHAIN (p, 17) -INCHAIN (pA. link, 17) , 
respectively.  Condition  BOUNDARY”  is 

(“found  OR  found"INCHAIN(pO,17) ) ,  p-pO,  (p-NIL  OR  found) 

->  found-INCHAIN (pO ,17) , 

which  simplifies  to 

(found  &  INCHAIN (p,  17))  OR  (“found  &  p-NIL) 

->  found-INCHAIN (p, 17) . 

Example  3.5  -  This  example  illustrates  a  circumstance  where 
several  of ' tHe- above  heuristics  are  applied  in  order  to  create  a 
reduction  hypothesis.  The  following  program  sums  the  elements  of 
a  sequence: 
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{ S/NULL} 
sun  :-  0; 
while  sj*NULL  do 

sum  :«  sum  +  head(s) 
s  :»  tail(s) 


od; 

{sum-SUM(sO) } . 


The  notation  SUM(sO)  appearing  in  the  postcondition  stands  for 
the  summation  of  the  elements  of  sO.  We  begin  our  reasoning  as 
follows.  CLOSURE  is  not  satisfied  for  this  program  because  sum 
is  constrained  to  the  value  0  by  an  assignment  statement.  What 
would  be  the  effect  on  the  program  of  removing  this  assignment? 
After  the  first  loop  iteration,  sum  would  have  the  value 
sumO+head ( sO ) ,  rather  than  head(sO).  we  could  compensate  for 
this  discrepancy  by  subtracting  sumO  from  head(sO)  before  the 
loop  begins  execution.  Thus  we  choose  to  replace  *sum:»0"  with 
"head ( s)  : -head (s) -sum.  "  This  is  an  application  of  heuristic  NOCUT 
of  iheorem  3.2.  Indeed,  the  function  corresponding  to  this  new 
init  ialization, 

K3 (<sum0,s0>)»<sum, s>)  <->  sO^NULL  &  s/NULL  &  sum-sumO  & 

head (s) -head (sO)-sumO  &  tail(s)*tail(sO) , 
satisfies  NOCUT  and  is  thus  an  alternative  initialization.  We 
note  that  CLOSURE  is  satisfied  for  this  initialization  on  all  but 
the  final  loop  iteration  (i.e.  K3  cannot  produce  an  output  which 
satisfies  s»NULL) .  As  in  Example  3.4,  the  solution  under  these 
circumstances  is  to  supplement  the  initialization  with  a  function 
satisfying  VERYSHORTCUT;  in  this  example  such  a  function  would  be 

K4 ( <sum0 , s0> ) -<sum, s>)  <->  sum-SUM(sO)  &  s-NULL. 

Note  that  the  output  of  K4  causes  B  to  evaluate  to  FALSE  and 
forces  the  program  to  be  correct  wrt  its  specification.  The  com¬ 
plete  initialization  is  then 

if  sj<NULL  ->  head  ( s)  :-  head(s)  -  sum 
~ f  TRUE  ->  sum  :»  SUM(s)  ;  S  NULL 
fi. 

Let  K”""represent  this  initialization  (i.e.  let  K'  be  the  union  of 
K3  and  K4).  By  Theorem  3.3  (using  K3  for  G) ,  K'  is  a  reduction 
hypothesis.  Corresponding  to  whether  K3  or  K4  is  used  in  the 
term  K' (Xl,H  (X) ) ,  condition  ITERATION'  has  the  following  two 
aspects: 

sO^NULL,  sum-sumO,  head ( s) -head ( sO) -sumO ,  tail(s)«tail(sO) , 
s/NULL,  sl^NULL,  tail(s) /NULL,  sura+head (s) -suml , 
head(  tail(s) ) -head (si)  -suml,  tail(  tail(s)  )«tail(sl) 

->  SUM (sO) -SUM (si) 
and 

sO^NULL,  sum-sumO,  head(s) -head(sO) -sumO,  tail(s)-tail(sO) , 
s/NULL,  sum+head (s) -SUM (si) ,  tail(s)  -NULL 
->  SUM (sO) -SUM (si) 
which  simplify  to 

S^NULL,  tail(s)^NULL  -> 

sum+head  ( s)  -SUM  (tail(s))- 

sum+head (s)+head(  tail (s)  }+SUM(tail(  tail(s) ) ) 

and 
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tail( s)  “NULL  ->  sum+head(s)  +SUM(  tail(s) ) -sura+head  ( s) . 

Condition  BOUNDARY "  is 

Sum«SUM  ( sO ) ,  s«NULL  ->  SUM(sO)*sum. 

3.2.  Relation  to  Standard  Correctness  Techniques 

In  this  section,  we  discuss  the  relationship  between  the 
proposed  verification  strategy  and  the  subgoal  induction  [Morris 
&  Wegbreit  77]  (see  also  [Manna  &  Pnueli  70,  Manna  71]),  and 
inductive  assertion  [Hoare  69]  correctness  techniques.  Me  will 
define  these  methods  in  the  framework  of  the  verification  problem 
described  above,  i.e.  in  each  case  we  wish  to  prove/disprove  the 
program  P  correct  wrt  its  specification  function  f. 

All  three  techniques  call  for  creating  and  verifying  an 
hypothesis  concerning  some  aspect  of  the  behavior  of  P  and  then 
applying  the  hypothesis  to  prove/disprove  the  correctness  of  the 
program.  In  the  proposed  technique,  this  hypothesis  is  a  reduc¬ 
tion  hypothesis;  in  subgoal  induction,  we  will  refer  to  the 
hypothesis  as  a  tail  function;  in  the  inductive  assertion  tech¬ 
nique,  the  hypothesis  is  an  adequate  inductive  assertion. 

A  tail  function  g  in  a  subgoal  induction  proof  is  a  general 
description  of  the  input/output  behavior  of  program  TAIL. 
Specifically,  g  has  the  same  functionality  as  the  specification  f 
and  must  satisfy  each  of 

SI1:  X  e  D(g)  ,  B  (X)  ->  H  (X)  Q  D(g) 

SI2:  X  6  D(g)  ,  B  (X)  ->  g(X) »g(H  (X) ) 

SI 3s  X  S  D(g)  ,  ~B(X)  ->  g(X) *e(Q(X) ) 

SI 4:  X  e  D(g)  ->  TAIL  terminates  with  input  X 

SI  5:  X  e  D(f)  ->  K  ( X)  e  D(g)  . 

The  first  four  of  these  conditions  establish  that  TAIL  is  correct 
wrt  g,  SI 5  assures  that  D(g)  is  sufficient  for  testing  the 
correctness  of  the  program  P.  When  such  a  function  g  has  been 
found,  the  program  P  is  correct  wrt  f  iff 
SI 6:  X  e  D(f)  ->  g(K(X)  )-f  (X)  . 

In  the  inductive  assertion  technique,  an  adequate  inductive 
assertion  is  a  sufficiently  strong  invariant  relation  between 
initial  data  states  and  data  states  occurring  at  the  loop  predi¬ 
cate  B.  For  our  purposes,  an  adequate  inductive  assertion  for 
the  program  P  is  defined  as  a  binary  relation  A  over  the  data 
state  of  P  which  satisfies 
IA1:  X  6  D(f)  ->  A  (X,  K(X)  ) 

IA2:  X  e  D(  f)  ,  A  (X,  Y)  ,  B  (Y)  ->  A(X,H(Y)) 

IA3:  X  6  D(f),  A (X, Y)  ->  TAIL  terminates  with  input  Y 
IA4:  X  e  D(f),  A  (X,  Y)  ,  ~B(Y),  A(X,Z),  ~B  (Z) 

->  e(Q(Y))«e(Q(Z)). 

If  we  view  X  and  Y  as  representing  the  initial  and  current  data 
states,  IA1  and  IA2  prove  that  A(X,Y)  is  a  "loop  invariant,"  IA3 
is  the  necessary  termination  condition  based  on  A,  and  IA4  tests 
whether  A  is  sufficiently  strong  to  verify  the  correctness  of  the 
program.  When  such  a  relation  A  has  been  found,  the  program  P  is 
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correct  wrt  f  iff 

IA5:  X  6  D(f),  A(X,Y)  ,  ~B  (Y)  ->  f  (X)  -e(Q(  Y) )  . 

» 

The  results  of  this  section  are  contained  in  the  following 
two  theorems.  They  define  a  relationship  between  the  three  forms 
of  program  hypotheses  and  give  a  technique  for  transforming  tail 
functions  and  adequate  inductive  assertions  into  reduction 
hypotheses. 

Theorem  3.4  -  Let  g  be  a  tail  function  for  a  subgoal  induc¬ 
tion  proof  of” P7  and  let  K"  be  a  relation  defined  by 
|T(X0,X)  <->  g(K(XO))»g(X) . 

Then  K'  is  a  reduction  hypothesis. 

Proof  -  We  must  prove  that  K"  satisfies  CLOSURE  and  that 
TERMINATES  and  PRESERVES  hold  where  P'  is  P  with  initialization 
K'*  substituted  for  K.  To  see  that  K'  satisfies  CLOSURE,  let 
XO  6  D(f)  &  K”*  (XO  ,X)  &  B  (X) 
i.e. 

xo  e  D(f)  &  g(K(xO))«g(x)  &  B(X). 

By  SI1  and  SI2,  g(X)  «g(H  (X) )  j  hence  g  (K(X0)  )»g(H  (X) )  and 
XO  6  D(f)  &  K-*  (XO  ,H  (X)  )  , 

thus  CLOSURE  holds  for  K'.  TERMINATES  must  be  satisfied  since  the 
range  of  K"  contains  only  elements  of  D(g)  and  SI4  is  satisfied. 
Finally,  to  show  that  PRESERVES  holds,  we  will  prove  that  P  and 
P'  compute  the  same  function  (in  the  variables  of  interest)  over 
the  domain  D(f).  Let  XO  Q  0(f).  On  input  XO,  P  produces  a 
result  Y  satisfying 

K(X0)-X  &  Y»e( [TAIL] (X) ) 

for  some  state  X.  On  input  XO,  P'  produces  a  result  Y'  satisfy¬ 
ing 

K'(X0,X')  &  Y'«e( [TAIL] (X')  ) 
for  some  state  X'.  These  may  be  rewritten 
R(X0)«X  &  Y-g(X) 

and 

g(K(X0))-g(X")  &  Y'-g(X') 

using  the  definition  of  K'*  and  the  fact  that  SI1-SI4  imply  TAIL 
computes  g.  These  imply 
Y*g ( K ( XO ) ) »Y  * , 

hence  P  and  P'  compute  the  same  function  over  0(f). 

Thus  a  tail  function  g  together  with  program  initialization 
K  can  be  used  to  construct  a  reduction  hypothesis.  The  (perhaps) 
surprising  result  in  the  following  theorem  is  that  an  adequate 
inductive  assertion  (by  itself)  is  a  reduction  hypothesis,  i.e. 
the  class  of  reduction  hypotheses  for  P  contains  the  class  of 
adequate  inductive  assertions  for  P. 

Theorem  3.5  -  If  A  is  an  adequate  inductive  assertion  for  P, 
then  A  is  a  reduction  hypothesis  for  P.  , 

Proof  -  We  must  show  A  satisfies  CLOSURE  and  that  TERMINATES 
and  PRESERVES  hold  where  P'  is  P  with  initialization  A 
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substituted  for  K.  To  see  that  A  satisfies  CLOSURE,  let 
XO  6  0(f)  &  A (XO , X)  &  B  (X)  . 

By  IA2,  A  (XO  ,H  (X)  )  ,  thus  CLOSURE  holds  for  A.  TERMINATES  follows 
directly  from  IA3.  Finally,  to  show  that  PRESERVES  holds,  we 
will  show  that  P  and  P'  compute  the  same  function  (in  the  vari¬ 
ables  of  interest)  over  the  domain  D(f).  Let  XO  6  D(f).  On 
input  XO,  P  will  produce  Y  satisfying 
K(XO)«X  &  Y»e{ [TAIL] (X) ) 

for  some  state  X.  By  IA1,  A(XO,X).  By  IA3  the  WHILE  loop  ter¬ 
minates  on  input  X  giving  some  result  T  where  ~B(T).  Repeated 
application  of  IA2  and  the  loop  property 
B(Z)  ->  [TAIL]  (Z)»  [TAIL]  (H(Z)) 

gives  A(XO,T)  and  [TAIL]  (X)  *  [TAIL]  (T)-Q(T) .  Thus  on  input  XO,  P 
produces  Y  satisfying 

(3.1)  A (XO ,T)  &  ~B(T)  &  Y®e(Q(T) ) 

for  some  T.  P'  on  the  other  hand,  with  input  XO  will  produces  Y" 
satisfying 

A(XO  , X"*)  &  Y"*e(  [TAIL]  (X"*)  ) 

for  some  state  X'.  Again  by  IA3,  the  WHILE  loop  terminates  on 
input  X'  giving  some  result  T"  where  ~B  (T4*) .  Repeated  applica¬ 
tion  of  IA2  and  the  above  loop  property  gives  A(XO,T")  and 
[TAIL]  (X')a[TAILl  (T')*Q(T") .  Thus  P'  produces  Y'  satisfying 

(3.2)  AfXO,!')  &  ~B(T')  &  Y'»e(Q(T')) 

for  some  T'.  In  light  of  IA4,  (3.1)  and  (3.2)  imply  Y«Y“,  thus  P 
and  P'  compute  the  same  function  over  D(f). 

Example  3.6  -  To  illustrate  these  ideas,  we  consider  again 
the  program  discussed  in  Example  3.2: 

{a>-0} 
a  :»  a*2; 
while  a>0  do 
a  :»  a -17 
b  j-  b+1 
od; 
a  :*  b 

{a»a0*2  +  bO} . 

The  following  are  possible  tail  functions  for  a  subgoal  induction 
proof  of  the  program 

gl (<aO,bO>)-a  <->  a0>»0  &  a»aO+bO 
g2 (<a0 ,bO>)-a  <->  a-MAX(a0,0)+b0. 

Several  adequate  inductive  assertions  are 
Al(<aO,bO>,<a,b>)  <->  b«b0+2*a0-a  &  a>«0 
A2(<a0,b0> ,<a,b>)  <->  b«b0+2*a0-a  &  a>*0  &  a0>«0 
A3(<a0,b0>,<a,b>)  <->  b«b0+2*a0-a  6  a0>»a>«0 
A4(<a0,b0> ,<a,b>)  <->  b-bO+2*MAX  (aO  ,0) -MAX(a,0) . 

By  ttieorem  3.4, 

Kl(<aO,bO>,<a,b>)  <->  gl(K(<aO,bO>) )«gl(<a,b>) 
and 

K2(<a0,b0>,<a,b>)  <->  g2 (K( <a0 ,b0>) )«g2 (<a,b>) , 
i .  e . 

Kl ( <a0 ,b0> , <a, b>)  <->  2*a0>»0  &  a>«0  &  2*a0+b0«a+b 
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and 

K2 ( <aO , bO> ,  <a ,  b> )  <->  MAX(2*a0,0)+b0«MAX(a,0)+b 
are  reduction  hypotheses.  Theorem  3.5  states  that  each  of  A1-A4 
is  also  a  reduction  hypothesis.  Thus  any  of  these  can  be  used  in 
place  of  K'  in  the  proof  in  Example  3.2.  we  remark  that  the 
relation  K'  used  in  that  example  is  not  an  inductive  assertion, 
hence  the  class  of  adequate  inductive  assertions  is  a  proper  sub¬ 
set  of  the  class  of  reduction  hypotheses. 

3.3.  Proof  Transformations 

Applying  the  verification  technique  proposed  above  based  on 
ITERATION'  and  BOUNDARY'  requires  ascertaining  the  input/output 
behavior  of  the  loop  initialization,  loop  body  and  the  program 
text  following  the  loop.  In  many  cases,  this  ascertaining  pro¬ 
cess  may  be  difficult  (e.g.  if  these  program  segments  contain 
additional  WHILE  loops)  .  In  view  of  this  problem,  we  now  con¬ 
sider  the  situation  which  occurs  when  the  input/output  behavior 
of  the  program  text  following  the  WHILE  loop  is  not  explicitly 
known.  The  verification  problem  under  consideration,  then,  will 
be  of  the  form 

{XO  0  D(f)  &  X=X0} 

X  :*  K (X)  ; 
while  B (X)  do 
X  s«  H  (X) 

od; 

T 

{e(X)»f  (XO)} 

where  R,  B  and  H  are  as  before  and  T  represents  some  unspecified 
block  of  program  text.  Again  our  intention  is  to  prove/disprove 
the  program  correct  wrt  its  specification  function  f. 

Suppose  we  have  a  reduction  hypothesis  K'  for  P  and  can  show 
condition  ITERATION'  holds.  What  sense  does  BOUNDARY'  make  and 
how  can  we  proceed?  In  this  situation,  condition  BOUNDARY' 
corresponds  to  a  new  but  simpler  correctness  problem.  Specifi¬ 
cally,  we  must  prove 

{XO  6  D(f)  &  K'(X0,X)  &  ~B  (X)  } 

T 

(e(X)-f(XO)}. 

This  problem  is  simpler  than  the  original  due  to  the  fact  that 
the  loop  has  been  eliminated  from  the  program.  Thus  we  have  used 
a  reduction  hypothesis  and  condition  ITERATION'  to  transform  the 
correctness  question  for  the  original  program  P  to  a  correctness 
question  for  a  substantially  simpler  program.  If  the  program  T 
contains  further  looping  structures,  the  process  may  be  repeated. 

Example  3.7  -  The  following  program  operates  on  sequences  a, 
x,  y  and  i~  of  natural  numbers.  The  function  head(s)  of  a 
nonempty  sequence  s  is  the  leftmost  element  of  s,  and  tail(s)  is 
s  with  head(s)  removed.  The  infix  operator  ||  denotes 
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concatenation  of  sequences  and  NULL  denotes  the  sequence  with  0 
elements.  The  predicate  odd  (a)  is  true  iff  a  is  odd. 


while  a  ft  NULL  do 

if  odd ( head { a) )  then  y 
else  z 

a  :»  tail  (a) 
od; 

while  y  ft  NULL  do 

x  :■  x  ||  <head(y) >;  y 
od; 

while  z  ft  NULL  do 

x  :■  x  |  |  <head(z)  >;  z 
od 

{x»F  (xO  ,y0  ,z0  ,a0) } 


y  <head(a)> 

z  <head(a)>  f i; 


tail(y) 

tail(z) 


The  function  F  appearing  in  the  postcondition  is  defined  as 
F  (x,y,  z,  a)  *  x||  y  ||  ODDS  (a)  ||  z  ||  EVENS  (a) 

where  ODDS  (a)  is  the  sequence  containing  the  odd  elements  of  a  in 
the  order  they  appear  in  a  (EVENS (a)  is  similar).  In  this  exam¬ 
ple,  T  corresponds  to  the  last  two  WHILE  loops  and  since  there  is 
no  initialization,  K  is  the  identity  function.  Since  CLOSURE 
holds  for  this  function,  K  itself  is  a  reduction  hypothesis. 
Corresponding  to  the  two  paths  through  the  first  loop  body,  there 
are  the  two  ITERATION"  conditions,  namely 
a^NULL,  odd (head (a)) 

->  F(x,y,z,a)»F(x,y|  |  <head(a)  >  ,a,  tail(a) ) 
a/ NULL,  "odd (head (a) } 

->  F(x,y,z,a)-F(x,y,z|  |  <head(a)  > , tail(a) ) . 

Once  these  have  been  proven,  the  verification  problem  then 
transforms  to 

{a-NULL} 

while  y  ft  NULL  do 

x  :■  x  |  |  <head(y)>;  y  :■  tail(y) 

Si? 

while  z  ft  NULL  do 

x  :»  x  |  |  <head(z)>;  z  ;■  tail(z) 

{x»F (xO,yO,zO,aO) } . 

Here  T  corresponds  to  the  last  loop  and  again  K  is  the  identity 
function,  CLOSURE  holds  and  hence  K  itself  is  a  reduction 
hypothesis.  The  proof  of  this  program  thus  consists  of  showing 
ITERATION",  i.e. 

a*NULL,  yjrfNULL  ->  F  (x, y,  z, a) «F  (xl  |  chead (y)  > ,  tail(y) , z, a) 
and  then  verifying 
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{  a- NULL  &  y»NULL} 
while  z  +  NOLL  do 

x  :»  x  ||  <head(z)>;  z  :■  tail(z) 
od 

{x*F (xO,yO,zO,aO) } . 

Again  using  the  identity  function  as  a  reduction  hypothesis,  this 
remaining  verification  problem  can  be  proved  by  showing  ITERA¬ 
TION"  and  BOUNDARY  * ,  i.e. 
a»NULL,  y-NULL,  z/NOLL  -> 

F(x,y,z,a)»F(x|  |  <head(z)  >  ,y,  tail(z)  ,a) 

and 

a®NULL,  y=NULL,  z«NULL  ->  x*F (x, y, z, a) . 

We  remark  that  none  of  the  program-oriented  reduction 
hypothesis  techniques  defined  in  Theorem  3.2  assume  any  knowledge 
of  the  characteristics  of  T.  Thus  these  techniques  can  be  used 
for  creating  a  reduction  hypothesis  in  the  circumstance  where  the 
input/output  behavior  of  this  subprogram  is  not  known.  The 
specification-oriented  heuristic  of  Theorem  3.3,  however,  cannot 
be  employed  without  this  knowledge. 

In  Section  2.4,  it  was  explained  that  the  ease  with  which  a 
composite  program  is  verified  depends  largely  on  the  strategy 
used  to  decompose  the  program  and  prove  its  correctness  based  on 
analysis  of  its  components.  While  it  is  difficult  to  define  an 
"optimal"  decomposition  strategy,  we  offer  our  view  that  the 
notion  of  a  sequence  of  proof  transformations  presented  in  this 
section  seems  to  represent  a  quite  effective  decomposition  tech¬ 
nique  in  practice.  The  proof  proceeds  in  a  "top  down"  fashion, 
decomposing  the  original  verification  problem  into  new,  simpler 
problems.  By  way  of  contrast,  the  functional  approach  based  on 
prime  program  decomposition  is  a  more  "bottom  up"  technique;  each 
loop  is  analyzed  and  verified  individually,  without  consideration 
of  how  the  loop  fits  into  the  program  as  a  whole.  As  a  result,  a 
functional  proof  based  on  prime  program  decomposition  of  the  pro¬ 
gram  is  Example  3.7  would  have  required  knowing  the  intended 
function  of  each  of  the  three  WHILE  loops.  This  information  is 
unnecessary  when  the  program  is  verified  by  the  sequence  of  proof 
transformations. 

3.4.  Discussion 


From  a  practical  point  of  view,  it  is  difficult  to  carefully 
assess  the  relative  merits  of  the  proposed  program-verification 
methodology.  Preferences  by  people  involved  in  verifying  pro¬ 
grams  are  often  based  on  which  methodology  appears  to  be  more 
"natural"  or  "intuitive"  in  a  given  application.  Furthermore, 
answers  to  questions  such  as  these  no  doubt  are  largely  influ¬ 
enced  by  the  way  a  person  was  trained  in  the  field  of  software 
engineering. 
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Despite  this  caveat,  we  offer  our  view  that  in  a  number  of 
cases,  it  seems  "easier”  to  create  a  reduction  hypothesis  than  it 
does  to  create  an  adequate  inductive  assertion  or  tail  function 
for  a  proof  by  the  standard  techniques  discussed  in  Section  3.2. 
Indeed,  it  is  difficult  to  argue  the  reverse  case  in  light  of  the 
results  of  that  section,  which  state  that  any  adequate  inductive 
assertion  is  a  reduction  hypothesis,  and  that  any  tail  function 
can  be  simply  transformed  to  one.  On  the  other  hand,  our  feeling 
is  that  the  verification  conditions  which  result  from  the  use  of 
a  reduction  hypothesis  seem  somewhat  more  complicated  than  their 
counterparts  in  either  of  these  standard  techniques.  This  is 
largely  due  to  the  necessity  of  having  three  distinct  data  states 
appearing  in  condition  ITERATION"  and  two  distinct  data  states 
appearing  in  condition  BOUNDARY".  Both  of  these  verification 
conditions,  however,  are  usually  easily  simplified  in  practice. 

In  light  of  these  comments,  an  interesting  direction  for 
future  research  is  the  translation  of  heuristics  such  as  those 
defined  in  Theorems  3.2  and  3.3  into  the  framework  of  standard 
correctness  techniques.  For  example,  what,  if  any,  is  the  impact 
of  these  correctness/incorrectness  preserving  transformations  on 
the  synthesis  of  an  inductive  assertion  or  tail  function  for  the 
program  under  consideration?  Some  results  along  this  line  are 
presented  in  Chapter  4.  In  that  chapter,  heuristic  SHORTCUT  is 
applied  to  the  problem  of  synthesizing  tail  functions  for  ini* 
tialized  loop  programs  and  meets  with  a  fair  degree  of  success. 

The  solution  of  any  complex  problem  is  often  best  deconposed 
into  solutions  of  appropriate  simpler  problems.  This  is  an 
important  principle  on  which  our  verification  strategy  is  based. 
A  search  for  a  reduction  hypothesis  is  really  a  search  for  a 
suitable  simpler  problem  to  substitute  for  the  original.  As  dis¬ 
cussed  in  Section  3.3,  proving  condition  ITERATION"  for  this 
simpler  problem  decomposes  the  solution  of  the  new  problem  into 
the  solution  of  a  still  simpler  problem  and  so  on. 
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4.  A  Heuristic  For  Deriving  Loop  Functions 

In  this  chapter,  we  will  consider  programs  of  the  following 

form: 

< INITIALIZATION  STATEMENTS> 
while  <LOOP  PREDICATE>  do 
<  LOOP  BODY  STATEMENTS > 
od. 

These  programs  tend  to  occur  frequently  in  programming  in  order 
to  accomplish  some  specific  task,  e.g.  sort  a  table,  traverse  a 
data  structure,  calculate  some  arithmetic  function,  etc.  More 
precisely,  the  intended  purpose  of  such  a  program  is  often  to 
compute,  in  some  particular  output  variable(s),  a  specific  func¬ 
tion  of  the  program  inputs.  In  this  chapter,  we  address  the 
problem  of  analyzing  a  program  of  the  above  form  in  order  to 
prove  its  correctness  relative  to  this  intended  function. 

One  common  strategy  taken  to  solve  this  problem  is  to  heu- 
ristically  synthesize  a  sufficiently  strong  inductive  assertion 
(i.e.  loop  invariant  [Hoare  69])  for  proving  the  correctness  of 
the  program.  A  large  number  of  techniques  to  aid  in  the  discovery 
of  these  assertions  have  appeared  in  the  literature  (see,  for 
example,  [Wegbreit  74,  Katz  6  Manna  76]).  It  is  our  view,  how¬ 
ever,  that  these  techniques  seem  to  be  more  "machine  oriented" 
than  "people  oriented."  Uiat  is,  they  seem  geared  toward  use  in 
an  assertion  generator  for  an  automatic  program  verification  sys¬ 
tem.  Furthermore,  a  sizable  portion  of  the  complexity  of  these 
techniques  is  due  to  their  general-purpose  nature.  The  methodol¬ 
ogy  proposed  here  is  intended  to  be  used  by  programmers  in  the 
process  of  reading  (i.e.  understanding,  documenting,  verifying, 
etc.)  programs  and  is  tailored  to  the  commonly  occurring  verifi¬ 
cation  problem  discussed  above. 

An  alternative  to  the  inductive  assertion  approach  which  is 
addressed  in  this  chapter  is  to  invent  an  hypothesis  concerning 
the  intended  function  (i.e.  general  input/output  behavior)  of 
the  WHILE  loop.  Once  this  has  been  done,  the  loop  can  be 
proven/d isproven  correct  with  respect  to  the  hypothesis  using  the 
functional  technique  described  in  Chapter  2  [Mills  72,  Mills  75, 
Basu  &  Misra  75,  Morris  &  Wegbreit  77,  wegbreit  77,  Misra  78] . 
If  the  hypothesis  is  shown  to  be  valid,  the 
correctness/incorrectness  of  the  program  in  question  follows 
immediately.  It  has  been  shown  (Basu  &  Misra  76,  Misra  78,  Misra 
79,  Basu  80]  that  this  loop  hypothesis  can  be  generated  in  a 
deterministic  manner  (i.e.  one  that  is  guaranteed  to  succeed)  for 
two  restricted  classes  of  programs.  The  approach  suggested  here 
is  similar  to  this  method  in  that  the  same  type  of  loop  behavior 
seems  to  be  exploited  in  order  to  obtain  the  hypothesis.  Our 
approach  is  not  deterministic  in  general,  but  as  a  result,  is 
intended  to  be  more  widely  applicable  and  easier  to  use  than 
those  previously  proposed  in  the  literature. 
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One  view  of  the  problem  of  discovering  the  general 
input/output  behavior  of  the  WHILE  loop  under  consideration  might 
be  to  study  it  and  make  a  guess  about  what  it  does.  One  might  go 
about  doing  this  by  "executing"  the  loop  by  hand  on  several  sam¬ 
ple  inputs  and  then  guessing  some  general  expression  for  the 
input/output  behavior  of  the  loop  based  on  these  results.  Deci¬ 
sions  that  need  to  be  made  when  using  such  a  technique  include 
how  many  sample  inputs  to  use,  how  should  these  inputs  be 
selected,  and  how  should  the  general  expression  be  inferred. 
Another  consideration  is  that  hand  execution  can  be  a  difficult 
and  an  error  prone  task.  Indeed,  it  seems  that  the  loops  for 
which  hand  execution  can  be  carried  out  in  a  straightforward 
manner  are  the  ones  that  are  least  in  need  of  verification  or 
some  other  type  of  formal  analysis. 

Our  methodology  is  similar  to  this  technique  in  that  we 
attempt  to  infer  the  general  behavior  of  the  loop  from  several 
sample  loop  behaviors.  In  contrast  to  this  technique,  however, 
the  sample  behaviors  are  not  obtained  from  hand  execution,  rather 
they  are  obtained  from  the  specification  for  the  initialized  loop 
program.  In  many  of  the  cases  we  have  studied,  the  general 
behavior  of  the  loop  in  question  is  quite  easy  to  guess  from 
these  samples.  This  is  not  to  say  that  the  loop  computes  a  "sim¬ 
ple"  function  of  its  inputs  or  that  the  loop  necessarily  operates 
in  a  "simple"  manner.  Much  more  accurately,  the  ease  with  which 
the  general  behavior  can  be  inferred  from  the  samples  is  due  to  a 
"simple"  connection  between  a  change  in  the  input  value  of  an 
initialized  variable  and  the  corresponding  change  caused  in  the 
result  produced  by  the  loop.  We  will  expand  on  this  idea  in  what 
follows. 

4.1.  The  Technique 

In  order  to  describe  the  proposed  technique,  we  represent 
the  verification  problem  discussed  above  as  follows: 

{X  e  D(  f) } 

X  :■  K (X) ? 
while  B (X)  do 
X  :■  H  (X) 
od 

{v-f (XO)} . 

In  this  notation,  X  represents  the  data  state  of  the  program.  K 
and  H  are  data-state-to-data-state  functions  corresponding  to  the 
effects  of  the  initialization  and  loop  body  respectively.  B  is  a 
predicate  over  the  data  state.  The  program  is  specified  to  pro¬ 
duce  in  the  variable  v  of  X  a  function  f  of  the  input  data  state 
XO. 

If  D  is  the  set  of  all  possible  program  data  states  and  T  is 
the  set  of  values  that  the  variable  v  may  assume,  the  specifica¬ 
tion  function  f  has  the  functionality  f  :  D  ->  T.  In  order  to 
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verify  a  program  of  this  form,  we  choose  to  find  a  function  g  :  D 
->  T  which  describes  the  input/output  characteristics  of  the 
WHILE  loop  over  a  suitably  general  input  domain.  Specifically, 
this  input  domain  must  be  large  enough  to  contain  all  the  inter¬ 
mediate  data  states  generated  as  the  loop  iterates.  If  this  is 
the  case,  the  loop  is  said  to  be  closed  (see  Section  2.1)  for  the 
domain  of  g. 

we  briefly  consider  two  alternative  approaches  to  synthesiz¬ 
ing  this  loop  function  g.  The  alternatives  correspond  to  the 
"top  down"  and  "bottom  up"  approaches  to  creating  inductive 
assertions  discussed  in  [Katz  &  Manna  73,  Ellozy  81]  .  In  the 
"top  down"  alternative,  the  hypothesis  g  answers  the  question 
"what  would  the  general  behavior  of  the  loop  have  to  be  in  order 
for  the  program  to  be  correct?"  If  such  an  hypothesis  can  be 
found  and  verified,  the  correctness  of  the  program  is  esta¬ 
blished.  If  the  program  is  incorrect,  no  such  valid  hypothesis 
exists.  In  the  "bottom  up"  alternative,  the  hypothesis  g  answers 
the  question  "what  is  the  general  behavior  of  the  loop?"  In  this 
case,  a  valid  hypothesis  always  exists.  Once  it  has  been  found 
and  verified,  the  program  is  correct  if  and  only  if  the  initiali¬ 
zation  followed  by  g  is  equivalent  to  the  function  f. 

The  advantage  of  a  "top  down"  approach  is  that  it  is  usually 
easier  to  apply  in  practice  because  the  verifier  has  more  infor¬ 
mation  to  work  with  when  synthesizing  the  hypothesis.  The  disad¬ 
vantage  of  such  an  approach  is  that  it  may  not  be  as  well-suited 
to  disproving  the  correctness  of  programs.  This  is  because  to 
disprove  a  program,  the  verifier  must  employ  an  argument  which 
shows  that  there  does  not  exist  a  valid  hypothesis.  The  method 
described  in  this  chapter  is  based  on  the  "top  down"  approach. 
We  will  return  to  a  discussion  of  this  advantage  and  disadvantage 
later. 

We  begin  by  assuming  the  program  in  question  is  correct  with 
respect  to  its  specification.  We  then  consider  several  properties 
of  the  function  g  which  result  from  this  assumption.  First,  the 
correctness  of  the  program  implies 

(4.1)  XO  6  D(f)  ->  f (X0)»g(K(X0) ) . 

That  is,  for  inputs  satisfying  the  program  precondition,  the  ini¬ 
tialization  followed  by  the  loop  yields  the  desired  result. 
Secondly,  since  the  loop  computes  g, 

B(X0)  ->  g(X0)-g(H (XO) ) 

holds  by  the  "iteration  condition"  [Misra  78]  of  the  standard 
technique  for  showing  the  loop  computes  g.  This  implies 
B ( K ( XO ) )  ->  g(K(X0) ) «g(H (K(X0) ) ) . 

Combining  with  (4.1)  yields 

(4.2)  XO  e  D(f),  B (K(X0) )  ->  f (XO)-g(H(K(XO) ) ) . 

At  this  point  we  choose  to  introduce  an  additional  universally 
quantified  state  variable  X  into  each  of  (4.1)  and  (4.2),  result¬ 
ing  in  the  equivalent  conditions 

(4.1')  XO  e  D( f) ,  X-K(XO)  ->  g(X) «f (XO) 


(4.2")  XO  6  D(f),  B  (K(XO) )  ,  X*H(K(XO))  ->  g(X)-f(XO). 

We  summarize  by  saying  that  if  the  program  is  correct  with 
respect  to  its  specification,  conditions  (4.1")  and  (4.2")  hold. 

Suppose  now  that  the  specification  (f),  and  the  input/output 
behavior  of  the  initialization  (R) ,  loop  predicate  (B)  and  loop 
body  (H)  are  known.  Given  this,  (4.1")  and  (4.2")  can  be  used  to 
sol for  the  loop  hypothesis  g  on  a  certain  set  of  inputs  assum¬ 
ing  the  correctness  of  the  program.  Indeed,  (4.1")  and  (4.2") 
can  be  thought  of  as  defining  portions  of  the  unknown  loop  func¬ 
tion  g  we  are  seeking.  Specifically,  each  of  (4.1")  and  (4.2") 
can  be  viewed  as  defining  a  function  g  with  a  restricted  domain. 
In  this  light,  for  example,  (4.1")  defines  the  function  (i.e.  set 
of  ordered  pairs) 

g  =  {(x,z)  |  *  xo  e  D(f)  )-  (x«k(xo)  &  z=f(xo))}. 

We  call  (4.1")  and  (4.2")  constraint  functions  since  they  serve 
as  constraints  (i.e.  requirements)  on  the  general  loop  function. 
More  precisely  put,  the  constraint  functions  are  subsets  of  the 
general  loop  function.  The  hope  is  that  if  these  subsets  are 
representative  of  the  whole,  the  general  loop  function  may  be 
inferred  through  analysis  of  the  constraint  functions. 

In  what  follows  we  describe  a  four  step  process  for  con¬ 
structing  a  general  loop  function  g  from  these  constraint  func¬ 
tions.  we  suggest  that  the  reader  not  be  taken  aback  by  what  may 
appear  to  be  considerable  complexity  in  the  description  of  our 
technique.  We  intentionally  have  attempted  to  describe  the  pro¬ 
cedure  in  a  careful,  precise  manner.  Furthermore,  the  technique 
is  based  on  a  few  simple  ideas  and,  once  those  ideas  have  been 
learned,  we  feel  it  can  be  applied  with  a  considerable  amount  of 
success. 

Example  4.1  -  As  we  describe  these  steps,  we  will  illustrate 
their  application  on  the  following  trivial  program  to  compute 
multiplication: 

{v>«0} 

z  :*  0; 

while  v  ft  0  do 
z  :*  z  +  TcT 
v  :■  v  -  1 
od 

{z*v0*k} . 

We  now  proceed  with  a  description  of  these  steps. 

Step  1  :  RECORD  -  The  first  step  consists  of  recording  the 
constraint'- functions  (copied  from  (4.1")  and  (4.2")) 

Cl:  XO  e  D(f),  X-K(XO)  ->  g(X)-f(X0) 
and 

C2:  XO  e  D(f),  B (K(X0) ) ,  X-H(K(X0))  ->  g(X)-f(X0). 

As  a  notational  convenience,  we  dispense  with  the  data-state 
notation  and  use  program  variables  (possibly  subscripted  by  0  to 


denote  their  initial  values)  in  these  function  definitions.  The 
terms  XO  6  D(f)  and  f(X0)  come  from  the  pre-  and  post-condition 
for  the  initialized  loop  respectively.  The  term  X«K(XO)  is  based 
on  the  input/output  behavior  of  the  initialization,  and  the  terms 
B ( K ( XO ) )  and  X*H(K(X0))  together  describe  the  input/output 
behavior  of  the  initialization  followed  by  exactly  one  loop 
iteration.  We  illustrate  these  ideas  with  the  multiplication 
program  in  Example  4.1.  The  constraint  functions  for  this  pro¬ 
gram  are  as  follows: 

Cl:  v0>«0,  v*vO,  z»0  ->  g(z,v,k)»vO*k 
C2:  v0>0,  v»vO-l,  z«k  ->  g( z, v, k) «vO*k. 

we  make  the  following  comments  concerning  these  function  defini¬ 
tions.  First,  in  the  interest  of  simplicity,  we  do  not  RBCORD 
the  "effect"  of  the  initialization  or  loop  body  on  the  constant  k 
(i.e.  we  dispense  with  k«kO  and  the  need  for  a  symbol  kO) . 
Secondly,  g  is  defined  as  a  function  of  each  program  variable 
which  occurs  in  the  loop  predicate  or  loop  body.  That  is,  g  is  a 
function  of  the  variables  on  which  the  behavior  of  the  loop 
directly  depends.  Furthermore,  note  that  in  C2,  the  term  v0>0 
captures  both  XO  6  D(f)  (i.e.  v0>*0)  and  B(K(X))  (i.e.  v0/0) .  As 
a  final  remark,  in  a  constraint  function  we  will  use  the  phrase 
domain  requirement  to  refer  to  the  collection  of  terms  to  the 
left  of  the  "->"  symbol  and  function  expression  to  refer  to  the 
expression  which  defines  the  value  of  g  (e.g.  v6*k  in  both  Cl  and 
C2  above)  . 

Step  2  :  SIMPLIFY  -  All  variables  which  appear  in  the  func¬ 
tion  definition  but  not  in  the  argument  list  for  g  must  eventu¬ 
ally  be  eliminated  from  the  definition.  On  occasion,  it  is  pos¬ 
sible  to  solve  for  the  value  of  such  a  variable  in  the  domain 
requirement  and  substitute  the  equivalent  expression  for  it 
throughout  th*  definition.  To  illustrate,  in  the  definition  Cl 
above,  vO  is  u  candidate  for  elimination.  We  know  its  value  as  a 
function  of  v  (i.e.  vO»v)  ,  hence  we  can  SIMPLIFY  this  definition 
to 

Cl:  v>»0,  z-0  *■>  g(z,  v,  k) -v*k. 

Note  that  the  term  v«vO  has  disappeared  since  with  the  substitu¬ 
tion  it  is  equivalent  to  TRUE.  In  a  similar  manner,  the  second 
constraint  function  can  be  SIMPLIFIED  to  (using  vO»v+l) 

C2:  v>»0,  z«k  ->  g(z, v, k) ■ (v+l)*k. 

Although  applying  this  simplifying  heuristic  is  most  often  a 
straightforward  process,  care  must  be  taken  to  insure  that  the 
domain  of  the  constraint  function  is  not  mistakenly  extended. 
For'  example,  if  d  and  dO  are  integer  variables,  the  definition 
d0>0,  d«d0*2  ->  g(d)-d0*8 
does  not  SIMPLIFY  to 
d>0  ->  g(d)»d*4 

since  the  first  function  defines  a  value  of  g  only  for  positive, 
even  values  of  d  while  the  second  definition  defines  a  value  of  g 
for  all  positive  d.  The  first  function  does  SIMPLIFY  to 
d>0,  EVEN (d)  ->  g(d) «d*4 

where  EVEN (d)  is  a  predicate  which  is  TRUE  iff  d  is  even. 
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Step  3  :  REWRITE  -  Variables  which  appear  in  the  argument 
list  for  g  but  not  in  the  function  expression  of  its  definition 
are  candidates  to  be  introduced  into  the  function  expression. 
Each  of  these  variables  will  be  bound  to  a  term  in  the  domain 
requirement  of  the  definition.  The  purpose  of  this  step  is  to 
rewrite  the  function  expression  of  C2  (based  on  the  properties  of 
the  operation(s)  involved)  in  order  to  introduce  these  terms  into 
the  function  expression.  To  illustrate,  consider  the  above  SIM¬ 
PLIFIED  C2  definition.  The  variable  z  is  a  candidate  to  be 
introduced  into  the  function  expression  (v+l)*k.  It  is  bound  to 
the  term  k  in  the  domain  requirement.  Thus  we  need  to  introduce 
an  additional  term  k  into  this  function  expression.  One  way  to 
do  this  is  to  translate  the  expression  to  v*k+k.  Based  on  this, 
we  REWRITE  C2  as 

C2:  v>«0,  z»k  ->  g(z, v, k) «v*k+k. 

Step  4  :  SUBSTITUTE  -  In  steps  2  and  3,  the  constraint  func¬ 
tions  are  massaged  into  equivalent  definitions  in  order  to  facil¬ 
itate  step  4.  The  purpose  of  this  step  is  to  attempt  to  infer  a 
general  loop  function  from  these  constraints.  We  motivate  the 
process  as  follows.  Suppose  we  are  searching  for  a  particular 
relationship  between  several  quantities,  say  E,  m  and  c.  Furth¬ 
ermore,  suppose  that  through  some  form  of  analysis  we  have  deter¬ 
mined  that  when  m  has  the  value  17,  the  relationship  E«17*  (c**2) 
holds.  A  reasonable  guess,  then,  for  a  general  relationship 
between  E,  m  and  c  would  be  E"m*(c**2).  This  would  be  particu¬ 
larly  true  if  we  had  reason  to  suspect  that  there  was  a  rela¬ 
tively  simple  connection  between  the  quantities  m  and  E.  we 
arrived  at  the  general  relationship  by  substituting  the  quantity 
m  for  17  in  the  relationship  which  is  known  to  hold  when  m  has 
the  value  17.  Viewed  in  this  light,  the  purpose  of  the  con¬ 
straint  function  C2  is  to  obtain  a  relationship  which  holds  for  a 
specific  value  of  m  (e.g.  17).  The  step  REWRITE  exposes  the  term 
17  in  this  relationship.  Finally,  SUBSTITUTE  substitutes  m  for 
17  in  the  relationship  and  proposes  the  result  as  a  general  rela¬ 
tionship  between  E,  m  and  c.  In  terms  of  the  multiplication  pro¬ 
gram  being  considered,  the  SUBSTITUTE  step  calls  for  replacing 
one  of  the  terms  k  in  the  above  rewritten  function  expression 
with  the  term  z.  The  two  possible  substitutions  lead  to  the  fol¬ 
lowing  general  functions: 

v>»0  ->  g(z, v, k) *v*k+z 

and 

v>»0  ->  g(z, v, k) -v*z+k. 

Both  of  these  (necessarily)  are  generalizations  (i.e.  supersets) 
of  C2,  however,  only  the  first  is  also  a  generalization  of  Cl. 
Hence  this  function  is  hypothesized  as  a  description  of  the  gen¬ 
eral  behavior  of  the  above  WHILE  loop. 

We  have  applied  the  above  4  steps  to  obtain  an  hypothesis 
for  the  behavior  of  the  loop  in  question.  Since  this  description 
is  sufficiently  general  (specifically,  since  the  loop  is  closed 
for  the  domain  of  the  function)  ,  we  can  prove/disprove  the 
correctness  of  the  hypothesis  using  standard  verification 
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techniques  [Mills  75,  Misra  78]  .  Specifically,  the  hypothesis  is 
valid  if  and  only  if  each  of 

-  the  loop  terminates  for  all  v>»0, 

-  v-0  ->  z«z  +  v*k,  and 

-  z  +  v*k  is  a  loop  constant  (i.e.  vO*kO-z  +  v*k  is  a 

loop  invariant) 

hold.  We  remark  that  the  loop  hypothesis  is  selected  in  such  a 
way  that  if  it  holds  (i.e.  the  loop  does  compute  this  general 
function),  the  initialized  loop  is  necessarily  correct  with 
respect  to  f. 

We  emphasize  that  there  are  usually  an  infinite  number  of 
generalizations  of  the  constraint  functions  Cl  and  C2,  and  that, 
depending  on  how  REWRITE  and  SUBSTITUTE  are  applied,  the  tech¬ 
nique  is  capable  of  generating  any  one  of  these  generalizations. 
For  example,  REWRITE  and  SUBSTITUTE  applied  to  the  multiplication 
example  could  have  produced 
C2:  v>«0,  z«k  ->  g(z,v,k)> 

v*k  +  3*k  +  k*k*  (v-7)/(4*k)  +  k*k*k/(k*k) 

-  k*k*k*  (v-7)/(4*k*k)  -  k*k*k*3/(k*k) 

and 

v>«0  ->  g(z,v,k)» 

v*k  +  3*z  +  z*z*(v-7)/(4*k)  +  2*z*z/(k*k) 

-  z*z*z*(v-7)/(4*k*k)  -  z*z*z*3/(k*k) 
respectively,  where  "/"  denotes  an  integer  division  (with  trunca¬ 
tion)  infix  operator  which  yields  0  when  its  denominator  is  0. 
This  last  function  is  also  a  generalization  of  Cl  and  C2. 

It  has  been  our  experience,  however,  that  many  initialized 
loops  occur  in  which  there  exists  some  relatively  simple  connec¬ 
tion  between  different  input  values  of  the  variables  constrained 
by  initialization  and  the  corresponding  result  produced  by  the 
WHILE  loop.  Most  often  in  practice,  these  variables  are  bound  to 
values  in  the  domain  requirement  of  C2  which  suggest  an  applica¬ 
tion  of  REWRITE  that  uncovers  this  relationship  and  leads  to  a 
correct  hypothesis  concerning  the  general  loop  behavior.  In  the 
following  section  we  illustrate  a  number  of  example  applications 
of  this  technique. 

jl  .2.  Applications 

Example  4.2  -  The  following  program  computes  integer 

exponentiation?  ""This  example  serves  to  illustrate  the  use  of  the 
technique  when  the  loop  body  contains  several  paths: 

{d>-0} 

w:«l; 

while  d  M  do 

if  odd(d)  then  w  :•  w  *  c  fi* 
c  :•  c*c;  d  :■  d/2 
od 

{ w«cff  A  dO } . 
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The  infix  operator  *  appearing  in  the  postcondition  represents 
integer  exponentiation.  The  first  constraint  function  is  easily 
RECORDED: 

d0>»0,  c»cO,  d«dO,  w«l  ->  g(w#c,d)»cO'vdO 
and  SIMPLIFIES  to 
Cl:  d>«0,  w»l  ->  g(w,c,d) »cAd. 

Since  there  exist  two  paths  through  the  loop  body,  we  will  obtain 
two  second  constraint  functions.  The  first  of  these  deals  with 
the  path  which  updates  the  value  of  w  and  is  executed  when  the 
input  value  of  d  is  odd.  The  function  is 
d0>0,  odd(dO),  w«cO,  c«cO*cO,  d«d0/2  ->  g(w,c,d)»cO*dO 
which  SIMPLIFIES  to 

C2a:  d>«0,  c«w*w  ->  g(w,c, d)  *wA  (d*2+l) . 

The  function  corresponding  to  the  other  loop-body  path  is 
d0>0,  "odd(dO),  w*l,  c«cO*cO,  d«d0/2  ->  g(w,c,d)*cO''dO 
and  SIMPLIFIES  to 

d>*0,  w*l,  SQUARE  (C)  ->  g(w,c,d)«SQRT(c)  A  (d*2) 

i.e. 

C2b:  d>*0,  w*l,  SQUARE(c)  ->  g(w,c, d)  acAd 
where  SQUARE(x)  is  a  predicate  which  is  TRUE  iff  x  is  a  perfect 
square  and  SQRT(x)  is  the  square  root  of  the  perfect  square  x. 
The  term  SQUARE(x)  is  necessary  in  the  domain  requirement  since 
the  unSIMPLIFIED  function  is  only  defined  for  values  of  c  which 
are  perfect  squares.  Note  that  C2b  is  a  subset  of  Cl  and  hence 
is  of  no  additional  help  in  characterizing  the  general  loop  func¬ 
tion.  The  heuristic  suggested  in  REWRITE  is  to  rewrite  the  func¬ 
tion  expression  w~  (d*2+l)  of  C2a  in  terms  of  w,  w*w  (so  as  to 
introduce  c)  and  d.  The  peculiar  nature  of  the  exponent  in  this 
expression  leads  one  to  the  equivalent  formula  w*  ( (w*w)  ~d) . 
Applying  SUBSTITUTE  in  C2a  yields 
d>»0  ->  g(w,c,d) »w* (c*d) . 

This  function  is  in  agreement  with  (i.e.  is  a  superset  of)  Cl  and 
thus  is  a  reasonable  hypothesis  for  the  general  loop  function. 

In  this  example,  the  portion  of  C2  corresponding  to  the 
loop-body  path  which  bypasses  the  updating  of  the  initialized 
data  is  a  subset  of  Cl.  Based  on  this,  one  might  conclude  that 
such  loop-body  paths  should  be  ignored  when  constructing  C2.  Con¬ 
sidering  all  loop-body  paths,  however,  does  increase  the  likeli¬ 
hood  that  an  incorrect  program  could  be  disproved  (at  the  time 
the  general  loop  function  is  being  constructed)  by  observing  an 
inconsistency  between  constraint  functions  Cl  and  C2.  For 
instance,  in  the  example,  if  the  assignment  to  c  had  been  written 
"c:«c*2”,  the  above  analysis  would  have  detected  an  inconsistency 
in  the  constraints  on  the  general  loop  function.  Such  an  incon¬ 
sistency  implies  that  the  hypothesis  being  sought  for  the  general 
behavior  of  the  loop  does  not  exist,  and  hence,  that  the  program 
is.  not  correct  with  respect  to  its  specification. 

In  the  previous  section,  the  reader  may  recall  that  awkward¬ 
ness  in  disproving  programs  was  offered  as  a  disadvantage  of  a 
"top  down”  approach  to  synthesizing  g.  However,  it  has  been  our 
experience  that,  as  in  the  above  instance,  an  error  in  the 
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program  being  considered  often  manifests  itself  as  an  incon¬ 
sistency  between  Cl  and  C2.  Such  an  inconsistency  is  usually 
"easy"  to  detect  and  hence  the  program  is  "easy”  to  disprove. 
While  it  is  difficult  to  give  a  precise  characterization  of  when 
this  will  occur,  intuitively,  it  will  be  the  case  provided  that 
the  "error”  (e.g.  c*2  for  c*c)  can  be  "executed”  on  the  first 

iteration  of  the  loop. 

Example  4.3  -  The  following  program  counts  the  number  of 
nodes  in  a  nonempty  binary  tree  using  a  set  variable  s.  It 
differs  from  the  previous  example  in  that  more  than  one  variable 
is  initialized.  The  tree  variable  t  is  the  input  tree  whose 
nodes  are  to  be  counted.  We  use  the  notation  left(t)  and 
right(t)  for  the  left  and  right  subtrees  of  t  respectively.  The 
predicate  empty (t)  is  TRUE  iff  t  is  the  empty  tree  (i.e.  contains 
0  nodes)  . 

{ -empty  (t)} 
n  :*  0;  s  :*  { t} j 
while  s  j*  { }  do 

select  and  remove  some  element  e  from  s; 
n  :*  n  +  1; 

if  “empty(  left(e))  then  s  :■  s  U  (left  (e)J  fij 
if  “empty  (right (e) )  then  s  s  U  {right(e)}  fi 
o5~ 

{n-NODES(t) } 


The  notation  NODES (t)  appearing  in  the  postcondition  stands  for 
the  number  of  nodes  in  binary  tree  t.  The  first  constraint  func¬ 
tion  is 

Cl:  ~empty(t),  n»Q,  s*{  .  }  ->  g(n,  s)  -NODES (t) . 

Rather  than  considering  each  of  the  four  possible  paths  through 
the  loop  body  individually,  we  abstract  the  combined  effect  of 
the  two  IF  statements  as  the  assignment 
s  s  U  SONS(e)  , 

where  SONS(x)  is  the  set  of  0,  1  or  2  nonempty  subtrees  of  x. 
Applying  this,  the  second  constraint  function  is 
C2:  “erapty(t),  n»l,  s-SONS(t)  ->  g(n, s) -NODES (t)  . 
we  choose  to  REWRITE  the  function  expression  for  C2  using  the 
recursive  definition  that  NODES (x)  for  a  nonempty  tree  x  is  1 
plus  the  NODES  value  of  each  of  the  0,  1  or  2  nonempty  subtrees 
of  x.  Specifically,  this  would  be 
1+SUM ( X ,  SONS  ( t)  , NODES  (X) ) 

where  SUM(A,B,C)  stands  for  the  summation  of  C  over  all  A  6  B. 
Applying  SUBSTITUTE  in  the  obvious  way  yields 
"empty(t)  ->  g(n,  s)*n+SUM(x,s, NODES  (x) ) 
which  is  in  agreement  with  Cl  and  is  thus  a  reasonable  guess  for 
the  general  loop  function  g. 

Two  remarks  are  in  order  concerning  this  example.  The  first 
deals  with  the  condition  "empty (t)  appearing  in  the  domain 
requirement  of  the  obtained  function.  The  reader  may  wonder,  if 
t  is  not  referenced  in  the  loop  (it  is  not  in  the  argument  list 
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for  g)  ,  how  can  the  loop  behavior  depend  cm  empty (t)?  The  answer 
is  that  it  obviously  cannot;  the  above  function  is  simply 
equivalent  to 

g  (n,  s)  »n+SUM(x,  s, NODES  (x) ) . 

For  the  remainder  of  the  examples  of  this  section,  we  assume  that 
these  unnecessary  conditions  are  removed  from  the  domain  require¬ 
ment  of  the  constraint  function  as  part  of  the  SUBSTITUTE  step. 

As  a  second  point,  in  Example  4.3  we  encounter  the  case 
where  the  obtained  function  is,  strictly  speaking,  too  general, 
in  that  its  domain  includes  "unusual”  inputs  for  which  the 
behavior  of  the  loop  does  not  agree  with  the  function.  For 
instance,  in  the  example,  the  loop  computes  the  function 
g  (n,s)  *n+SUM(x,  s, NODES (x) ) 

only  under  the  provision  that  the  set  s  does  not  contain  the 
empty  tree.  This  is  normally  not  a  serious  problem  in  practice. 
One  proceeds  as  before,  i.e.  attempts  to  push  through  a  proof  of 
correctness  using  the  inferred  function.  If  the  proof  is  suc¬ 
cessful,  the  program  has  been  verified;  otherwise,  the  charac¬ 
teristics  of  the  input  data  which  cause  the  verification 
condition(s)  to  fail  (e.g.  s  contains  an  empty  tree)  suggest  an 
appropriate  restriction  of  the  input  domain  (e.g.  s  contains  only 
nonempty  trees)  and  the  program  can  then  be  verified  using  this 
new,  restricted  function. 

Example  4.4  [Gries  79]  -  Ackermann's  function  A(m,n)  can  be 
defined  as  follows  for  all  natural  numbers  m  and  n: 

A(0,n)  *  n+1 

A (m+1,0)  ■  A (m, 1) 

A(m+l,n+l)  *  A (m,A  (m+l,n) )  . 

The  following  program  computes  Ackermann's  function  using  a 
sequence  variable  s  of  natural  numbers.  The  notation  s(l)  is  the 
rightmost  element  of  s  and  s(2)  is  the  second  rightmost,  etc. 
The  sequence  s(..3)  is  s  with  s(2)  and  s(l)  removed.  Mb  will  use 
<  and  >  to  construct  sequences,  i.e.  a  sequence  8  consisting  of  n 
elements  will  be  written  <s(n),  ...  ,s(2),s(l)>. 

{m>*0,n>«0} 
s  :■  <m,n>; 
while  size(s)  ^  1  do 

if  s(2)  »  0  then  s:*s(..3) 
elseif  s(l)»0  then  ss«s(..3) 
else  s:-s(..3) 

{s«<X(m,n)  >} 

For  this  program,  the  first  constraint  function  is 
Cl:  m>»0,  n>»0,  s*<m,n>  ->  g(s)  ■<A(m,n)  >  . 

The  second  constraint  functions  corresponding  to  the  3  paths 
through  the  loop  body  are 

C2a:  ra«0,  n>«0,  s«<n+l>  ->  g(s)><A(m,n)> 


<s(l)+l> 

<s (2) -1,1> 

<8  (2)  -1  ,s(2) , s(l)  -1>  fi 


C2b:  m>0,  n  »0,  s«<m-l,l>  ->  g(s)«<A(m,n)  > 

C2c:  m>0,  n  >0,  s»<m-l ,m, n-l>  ->  g(s)«<A(m, n)  >  . 

REWRITING  these  3  based  on  the  above  definition  of  A  yields 
m»0,  n>«0,  s*<n+l>  ->  g(s)*<n+l> 

m>0,  n  -0,  s«<m-l,l>  ->  g(s)-<A(m-l,l)> 

m>0,  n  >0,  s»<m-l,m,n-l>  ->  g(s) -<A(m-l ,A (m, n-1) ) > . 
SUBSTITUTING  here  yields 

s*<s(l)>  ->  g(s)*<s(l)> 

s=><s  (2) , s(l) >  ->  g(s)«<A(s(2) ,  s(l)  )> 

s»<s (3) ,  s(2) , s(l) >  ->  g( s) »<A(s(3) ,A(s(2),s(l)))>. 

Note  that  the  second  of  these  functions  implies  Cl.  The  3  seem 
to  suggest  the  general  loop  behavior  (where  n>l) 
g( <s (n) , s(n-l) ,  ...  ,s(l)>)  = 

<A(s(n) /A (s(n-l) ,  ...  A(s(2),s(l))  ...  ))>. 

We  remark  that  in  the  first  3  examples,  the  heuristic 
resulted  in  a  loop  function  which  was  sufficiently  general  (i.e. 
the  loop  was  closed  for  the  domain  of  the  inferred  function) . 
Example  4.4  illustrates  that  this  does  not  always  occur.  The 
loop  function  heuristic  is  helpful  in  the  example  in  that  it  sug¬ 
gests  a  behavior  of  the  loop  for  general  sequences  of  length  1,  2 
and  3.  Based  on  these  results,  verifier  is  left  to  infer  a 
behavior  for  a  sequence  of  arbitrary  length. 

Example  ^.5  -  Let  v  be  a  one  dimensional  array  of  length  n>0 
which  contains  natural  numbers.  The  following  program  finds  the 
maximum  element  in  the  array: 

m  :*  0;  i  1; 

while  i  <■  n  do 

'  If  m  <  v ( iT~ then  m  : ■  v [ i ]  fi; 
i  :■  i  +  1 
fi 

{m-BIGGEST  (v)  } 

The  notation  BIGGEST (v)  appearing  in  the  postcondition  stands  for 
the  largest  element  of  v.  The  following  constraint  functions  are 
obtained 

Cl:  m«0,  i*l  ->  g(m,  i,v,n)  -BIGGEST  (v) 

C2:  m«v(l]  ,  i«2  ->  g(m,  i,v,n) -BIGGEST (v)  . 

Noticing  the  appearance  of  v[l]  and  2  in  C2,  we  REWRITE 
BIGGEST (v)  in  C2  as  MAX(v[l]  ,BIGGEST(v[2..n] ) ) ,  where  MAX  returns 
the  largest  of  its  two  arguments,  and  v[2..n]  is  a  notation  for 
the  subarray  of  v  within  the  indicated  bounds.  The  generaliza¬ 
tion  which  suggests  itself, 

g(m,  i,  v,n)  -MAX  (m, BIGGEST  (v[  i.  .n] ) ) , 
agrees  with  Cl. 

Example  4.6  -  If  p  is  a  pointer  to  a  node  in  a  binary  tree, 
let  POSf  (p)  Be” the  sequence  of  pointers  which  point  to  the  nodes 
in  a  postorder  traversal  of  the  binary  tree  pointed  to  by  p.  The 
following  program  constructs  POST(p)  in  a  sequence  variable  vs 
using  a  stack  variable  stk.  we  use  the  notation  l(p)  and  r(p) 
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for  the  pointers  to  the  left  and  right  subtrees  of  the  tree 
pointed  to  by  p.  If  p  has  the  value  NIL,  POST(p)  is  the  enpty 
sequence.  The  variable  rt  points  to  the  root  of  the  input  tree 
to  be  traversed. 

p  :*  rt;  stk  :-  EMPTY;  vs  :-  <>; 
while  “ (p-NIL  &  stk-EMPTY)  do 
if  p/NIL  then 

stk  <»  p  7*  push  p  onto  stk  */  ; 
p  l(p) 
else 

p  <«  stk  /*  pop  stk  */  ; 
vs  :*  vs  |  |  <p>; 
p  :»  r  (p)  fi 
od 

{vs  »  POST  (rt) } . 

Up  until  now,  we  have  attempted  to  infer  a  general  loop  function 
from  two  constraint  functions.  Of  course,  there  is  nothing  spe¬ 
cial  about  the  number  two.  In  this  exanple,  the  "connection" 
between  the  initialized  variables  and  the  function  values  is  not 
clear  from  the  first  two  constraint  functions  and  it  proves  help¬ 
ful  to  obtain  a  third  constraint  function.  Functions  Cl  and  C2 
correspond  to  0  and  1  loop-body  executions,  respectively.  The 
third  constraint  function  C3  will  correspond  to  2  loop-body  exe¬ 
cutions.  We  will  use  the  notation  (el,  ...  ,en)  for  a  stack  con¬ 
taining  the  elements  el,  ...  ,en  from  top  to  bottom.  The  con¬ 
straint  functions  for  this  program  are 

Cl:  p«rt,  stk-EMPTY,  vs-<>  -> 

g  ( p,  stk ,  vs)  -POST  ( r  t) 

C2s  rt/NIL,  p-l(rt),  stk-(rt),  vs*<>  -> 

g  ( p,  stk ,  vs)  -POST  ( r  t) 

C3a:  rt/NIL,  l(rt)^NIL,  p-l(l(rt)),  stk-  (1( rt)  ,  rt)  ,  vs-<>  -> 
g(p,  stk,  vs)  -POST(rt) 

C3b:  r t/NIL,  l(rt)«NIL,  p-r(rt),  stk-EMPTY,  vs»<rt>  -> 
g(p,stk,  vs)  -POST  (rt)  . 

Note  that  there  are  two  third  constraint  functions.  C3a  and  C3b 
correspond  to  executions  of  the  first  and  second  loop -body  paths 
(on  the  second  iteration),  respectively.  There  is  only  1  second 
constraint  function  since  only  the  first  loop-body  path  can  be 
executed  on  the  first  iteration.  Using  the  recursive  definition 
of  POST,  we  REWRITE  C2,  C3a  and  C3b  as  follows: 

C2":  rt/NIL,  p-l(rt)  ,  stk-(rt),  vs-<>  -> 

g(p,stk,  vs)  -POST(l(rt) )  |j<rt>||  POST(r(rt)) 

C3a" :  rt/NIL,  l(rt)^NIL,  p-l(l(rt)),  stk- (l(rt) , rt) ,  vs«<>  -> 
g(p,  stk,  vs)  -POST  (1(  1(  rt) ) )  [l<l(rt)>||  POST  (r(  1(  rt) ) ) 

1 1  <r t>  1 1  POST  (r(  rt) ) 

C3b“ :  rt^NIL,  l(rt)-NIL,  p-r(rt),  stk-EMPTY,  vs-<rt>  -> 
g(p, stk, vs) -<rt>  ||  POST(r(rt)). 

Applying  SUBSTITUTE  to  each  of  C2',  C3a'  and  C3b'  suggests 
stk- (el),  vs-<>  ->  g(p,stk,vs)-POST(p)  <el>  POST (r (el)) 

stk-(el,e2),  vs-<>  ->  g(p,stk, vs)-POST (p)  <el>  POST(r(el)) 

<e2>  POST (r (e2) ) 
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stk»EMPTY  ->  g(p,  stk,  vs)  >vs|  |POST  (p) 

respectively.  The  first  2  of  these  functions  imply  the  following 
behavior  for  an  arbitrary  stack  where  vs  has  the  value  <>: 
stk«(el,  en)  ,  vs«<>  ->  g(p,stk,vs)  ■ 

POST  (p)  ||  ( <el>  |  |  POST  (el)  |  | ...  |  |  <en>  |  |  POST(en)) 
and  in  combination  with  the  last  function,  the  general  behavior 
stk«(el,  . ..,  en)  ->  g(p,stk,vs)  ■ 

vs  ||  POST  (p)  ||  ( <el>  |  |  POST  (el)  |  | ...  |  |  <en>  |  |  POST(en)) 
is  suggested. 

In  this  section  we  have  illustrated  the  use  of  our  technique 
on  a  number  of  example  programs.  The  reader  has  seen  that  the 
success  of  the  method  hinges  largely  on  the  way  REWRITE  is  per¬ 
formed.  What  guidelines  can  be  used  in  deciding  how  to  apply 
this  step?  The  general  rule  given  above  is  to  identify  the  vari¬ 
ables  that  need  to  be  introduced  into  the  expression  and  then  to 
rewrite  the  expression  using  the  terms  to  which  these  variables 
are  bound.  For  instance  in  Example  4.3,  NODES (t)  was  rewritten 
using  the  terms  1  and  SONS(t).  Beyond  this  rule,  however,  the 
reader  may  have  noticed  an  additional  similarity  in  the  way 
REWRITE  was  applied  in  these  examples.  If  f  is  the  function  or 
operation  the  initialized  loop  program  is  intended  to  compute, 
each  REWRITE  step  involved  decomposing  an  application  of  f  in 
some  way.  In  Example  4.1,  for  instance,  a  multiplication  opera¬ 
tion  was  decomposed  into  an  addition  and  multiplication  opera¬ 
tion;  in  Exanple  4.3,  a  NODES  operation  was  deconposed  into  a 
summation  and  a  number  of  NODES  operations;  in  Exanple  4.5,  a 
BIGGEST  operation  was  decomposed  into  a  MAX  and  a  BIGGEST  opera¬ 
tion.  In  Section  4.5  we  will  characterize  this  idea  of  decompos¬ 
ing  the  intended  operation  of  the  initialized  loop  program  and 
discuss  several  implications  of  the  characterization  for  the  pro¬ 
posed  technique. 

In  Example  4.6,  we  saw  that  the  technique  generalizes  to  the 
use  of  three  (and  indeed  an  arbitrary  number  of)  constraint  func¬ 
tions.  We  have  seen  that  each  of  these  functions  defines  a  sub¬ 
set  of  the  general  loop  function  g  being  sought.  If  the  con¬ 
straint  functions  themselves  are  sufficiently  general,  it  may  be 
that  the  first  several  of  these  functions,  taken  collectively, 
constitute  a  complete  description  of  g.  We  consider  this  situa¬ 
tion  in  the  following  section. 

4.3.  Complete  Constraints 

The  technique  described  above  for  obtaining  a  general  loop 
function  is  "nondeterministic”  in  that  the  constraint  functions 
do  not  precisely  identify  the  desired  function;  rather  they  serve 
as  a  formal  basis  from  which  intelligent  guesses  can  be  made  con¬ 
cerning  the  general  behavior  of  the  loop.  Our  belief  is  that  it 
is  often  easy  for  a  human  being  to  fill  in  the  remaining  "pieces" 
of  the  loop  function  "picture"  once  this  basis  has  been  esta¬ 
blished. 
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There  exist,  however,  circumstances  when  the  constraints  do 
constitute  a  complete  description  of  an  adequate  loop  function. 
Specifically,  this  description  may  be  complete  through  the  use  of 
one,  two  or  more  of  the  constraint  functions.  The  significance 
of  these  situations  is  that  no  guessing  or  "filling  in  the  pic¬ 
ture"  is  necessary;  the  program  can  be  proven/d isproven  correct 
using  the  constraints  as  the  general  loop  function.  In  this  sec¬ 
tion  we  give  a  formal  characterization  of  this  circumstance. 

Definition  ^.1  -  For  some  N  >  0,  an  initialized  loop  is  N- 
closed  with  respect  to  its  specification  f  iff  the  union  of  the 
constraint  functions  C1,C2,  ...  ,CN  is  a  function  g  such  that  the 
loop  is  closed  for  the  domain  of  g.  In  this  case,  the  con¬ 
straints  C1,C2,  ...  CN  are  complete . 

Thus  if  a  loop  is  N-closed  for  some  N>0,  the  union  of  the 
first  N  constraint  functions  constitutes  an  adequate  loop  func¬ 
tion  for  the  loop  under  consideration.  Intuitively,  the  value  N 
is  a  measure  of  how  quickly  (in  terms  of  the  number  of  loop 
iterations)  the  variables  constrained  by  initialization  take  on 
"general"  values. 

Example  4.7  -  The  following  program 

{b>-0} 
a  :3  a  +  1; 
while  b  >  0  do 
a  :*  a  +T; 
b  :»b-l 
od 

{ a*  a(T'+  bO  +  1} 

is  1-closed  since  the  first  constraint  function  is 
Cl:  b0>«0,  a*a0+l,  b«b0  ->  g(a,b)*a0+b0+l 
which  SIMPLIFIES  to 

b>»0  ->  g(a,b)«a+b 

and  the  loop  is  closed  for  the  domain  of  this  function.  Thus  Cl 
by  itself  defines  an  adequate  loop  function. 

Initialized  loops  which  are  1-closed  seem  to  occur  rarely  in 
practice.  Somewhat  more  frequently,  an  initialized  loop  will  be 
2-closed.  For  these  programs,  the  loop  function  synthesis  tech¬ 
nique  described  above  (using  2  constraint  functions)  is  deter¬ 
ministic. 

Example  4.8a  -  Consider  the  program 
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sum  :*  0; 

while  seq  j*  EMPTY  do 

sum  :*  sum  +  hea3(seq)  j 
seq  :*  tail(seq) 
od 

{sum-SIGMA(seqO) } . 

The  notation  SIGMA(seqO)  appearing  in  the  postcondition  stands 
for  the  sum  of  the  elements  in  the  sequence  seqO.  The  program  is 
2 -closed  since  the  second  constraint  function  is 
C2:  seqOjf EMPTY ,  sum* head  ( seqO)  ,  seq*ta i  1  (  seqO)  -> 

g  ( sum,  seq)  *SIGMA  ( seqO ) 

which  SIMPLIFIES  to 

g  ( sum,  seq)  *sum+SIGMA  ( seq)  . 

The  loop  is  trivially  closed  for  the  domain  of  this  function. 

Example  4.8b  -  As  a  second  illustration  of  a  2-closed  ini¬ 
tialized  loop,  the  following  program  tests  whether  a  particular 
key  appears  in  an  ordered  binary  tree. 


success  :*  FALSE; 

while  tree  f  NULL  &  "success  do 

if  name (tree)  *  key  then  success  r*  TRUE 
elseif  name  (tree)  <  key  then  tree  :*  right(tree) 
else  tree  :*  left  (tree)  fi 

53—  — 

{success  *  IN (key, treeQ) } 


The  notation  IN  (key,  treeO)  is  a  predicate  idhich  is  true  iff  key 
occurs  in  ordered  binary  tree  treeO.  This  program  is  also  2- 
closed.  Note  that  the  first  constraint  function 
Cl:  success*F ALSE ,  tree* treeO  -> 

g( success, tree, key) * IN (key, treeO) 

SIMPLIFIES  to 

success*FALSE  ->  g( success, tree, key) *IN (key, tree) . 

If  we  consider  the  first  path  through  the  loop  body,  the  second 
constraint  function  is 

C2:  success-TRUE,  treeO^NIL,  tree*treeO,  key* name ( tree)  -> 

g( success, tree, key) *IN( key , treeO) 

which  SIMPLIFIES  to 


8ucceas*TRUE,  tree^NIL,  key-name ( tree)  -> 

g(  success,  tree,  key)  *IN (key,  tree) . 

Although  the  domain  of  the  union  of  these  two  functions  is  some¬ 
what  restricted,  i.e. 

{ <success, tree , key>  | 

(("success)  OR  (tree^NIL  &  key*name(  tree) ) ) } , 
the  loop  is  nevertheless  closed  for  this  domain  and  hence  the 
initialized  loop  is  2-closed. 


Example  4.8c  -  Consider  the  sequence  of  initialized  loops 
Pl,P2,P3  ...  "defined  as  follows  for  each  I>0: 
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PI  :  {x>»0} 

x  s»  x  *  I; 
while  x  >  0  do 

x  s»  x  -  rr 

y  :■  y  +  k 
od 

{y»yC  +  xO*I *k}  . 


For  any  I>0,  the  £irst  I  constraint  functions  for  program  PI  are 
Cl:  x0>*0,  x»xO*I,  y»yO  ->  g(z,y,  k)=yO+xO*I*k 

C2:  xO>*l,  x«xO*I-l,  y»yO+k  ->  g  ( z,  y,  k)  »yO+xO*I*k 


• 

Cl:  xO>*I-l  ,x»xO*I- ( I— 1 ) ,  y=yO+k*  (1-1)  ->  g(x,  y,  k)  =yO+xO*I*k. 
These  SIMPLIFY  to 

x>=0,  MI  (x)  ->  g(x,  y,  k) -y+x*k 

x>»0,  MI  (x+1)  ->  g(x,y,  k)  »y+x*k 


• 

x>»0,  MI(x+(I-l))  ->  g(z,y,k)=y+x*k 
where  MI  is  a  predicate  which  is  TRUE  iff  its  argument  is  a  mul¬ 
tiple  of  I.  Since  the  union  of  these  is  the  function 
x>*0  ->  g(x, y,  k)  *y+x*k, 

and  the  loop  is  closed  for  the  domain  of  this  function,  we  con¬ 
clude  that  for  each  I>0,  program  PI  is  I-clo;* 

For  many  initialized  loops  which  seem  to  occur  in  practice, 
however,  there  does  not  exist  an  N  such  that  they  are  N-closed 
with  respect  to  their  specifications.  This  means  that  no  finite 
number  of  constraint  functions  will  pinpoint  the  appropriate  gen¬ 
eralization  exactly;  i.e.  when  applying  the  above  technique  in 
these  situations,  some  amount  of  inferring  or  guessing  will 
always  be  necessary.  A  case  in  point  is  the  integer  multiplica¬ 
tion  program  from  Example  4.1.  The  constraint  functions 
C1,C2,C3,  ...  define  the  general  loop  behavior  for  z*0,  z«k, 
z»2*k,  ...  etc.  The  program  cannot  be  N-closed  for  any  N  since 

with  input  v*N+l,  the  last  value  of  z  will  be  (N+l)*k  which  is 
not  in  the  domain  of  any  of  these  constraint  functions. 

As  a  final  comment  concerning  N-closed  initialized  loops,  it 
may  be  instructive  to  consider  the  following  intuitive  view  of 
these  programs.  All  1-closed  and  2-closed  initialized  loops 
share  the  characteristic  that  they  are  "forgetful,”  i.e.  they 
soon  lose  track  of  how  "long"  they  have  been  executing  and  lack 
the  necessary  data  to  recover  this  information.  This  is  due  to 
the  fact  that  intermediate  data  states  which  occur  after  an  arbi¬ 
trary  number  of  iterations  are  indistinguishable  from  data  states 
which  occur  after  zero  (or  one)  loop  iterations.  To  illustrate, 
consider  the  2-closed  initialized  loop  of  Example  4.8a  which  sums 
the  elements  contained  in  a  sequence.  After  some  arbitrary 
number  of  iterations  in  an  execution  of  this  program,  suppose  we 


-59- 


stop  it  and  inspect  the  values  of  the  program  variables  sum  and 
seq.  Based  on  these  values,  what  can  we  tell  about  the  history 
of  the  execution?  The  answer  is  not  too  much;  about  all  we  can 
say  is  that  if  sum  is  not  zero  then  we  know  we  have  previously 
executed  at  least  1  loop  iteration,  but  the  exact  number  of  these 
iterations  may  be  1,  10  or  10000. 

By  way  of  contrast,  again  consider  the  integer  multiplica¬ 
tion  program  of  Example  4.1,  an  initialized  loop  we  know  not  to 
be  N-closed  for  any  N.  Suppose  we  stop  the  program  after  an 
arbitrary  number  of  iterations  in  its  execution.  Based  on  the 
values  of  the  program  variables  z,  v  and  k,  what  can  we  tell 
about  the  history  of  the  execution?  This  information  tells  us  a 
great  deal;  for  example,  we  know  the  loop  has  iterated  exactly 
z/k  times  and  we  can  reconstruct  each  previous  value  of  the  vari¬ 
able  z. 

Initialized  loops  which  have  the  information  available  to 
reconstruct  their  past  have  the  potential  to  behave  in  a  "tricky" 
manner.  By  "tricky"  here,  we  mean  performing  in  such  a  way  that 
depends  unexpectedly  on  the  history  of  the  execution  of  the  loop 
(i.e.  chi  the  effect  achieved  by  previous  loop  iterations).  The 
result  of  this  loop  behavior  would  be  a  loop  function  which  was 
"inconsistent"  across  all  values  of  the  loop  inputs  and  which 
could  only  be  inferred  from  the  constraint  functions  with  consid¬ 
erable  difficulty.  We  consider  this  phenomenon  more  carefully  in 
the  following  section;  for  now  we  emphasize  that  it  is  precisely 
the  potential  to  behave  in  this  unpleasant  manner  that  is  lacking 
in  1-closed  and  2-closed  initialized  loops  and  which  allows  their 
general  behavior  to  be  described  completely  by  the  first  one  or 
two  constraint  functions. 

4.4.  *Tr joky*  Programs 

The  above  heuristic  suggests  inferring  g  from  two  subsets  of 
that  function.  Cl  and  C2.  Constraint  function  C2  is  of  particu¬ 
lar  importance  since  REWRITE  and  SUBSTITUTE  are  applied  to  this 
function  and  it,  consequently,  serves  to  guide  the  generalization 
process.  C2  is  based  on  the  program  specification  f,  the  ini¬ 
tialization  and  the  input/output  behavior  of  the  loop  body  on  its 
first  execution.  In  any  problem  of  inferring  data  concerning 
some  population  b*sed  on  samples  from  that  population,  the  accu¬ 
racy  of  the  results  depends  largely  on  how  representative  the 
samples  are  of  the  population  as  a  whole.  The  degree  to  which 
the  sample  defined  in  C2  is  is  representative  of  the  unknown  gen¬ 
eral  function  we  are  seeking  depends  entirely  on  how  representa¬ 
tive  the  input/output  behavior  of  the  loop  body  on  the  first  loop 
iteration  is  of  the  input/output  behavior  of  the  loop  body  on  an 
arbitrary  subsequent  loop  iteration. 


To  give  the  reader  the  general  idea  of  what  we  have  in  mind, 
consider  the  program  to  count  the  nodes  in  a  binary  tree  in  Exam¬ 
ple  4.3.  If  the  loop  body  did  something  peculiar  when,  for 


ex  amp  le ,  the  set  s  contained  two  nodes  with  the  same  parent  node, 
or  when  n  had  the  value  IS,  the  behavior  of  the  loop  body  on  its 
first  execution  would  not  be  representative  of  its  general 
behavior.  By  "peculiar"  here,  we  mean  something  that  would  not 
have  been  anticipated  based  solely  on  input/output  observations 
of  its  initial  execution.  An  application  of  our  heuristic  on 
programs  of  this  nature  would  almost  certainly  fail  since 
(apparently)  vital  information  would  be  missing  from  Cl  and  C2. 

Example  4.9  -  Consider  applying  the  technique  to  the  follow¬ 
ing  program  wHich  is  an  alternative  implementation  of  the  integer 
multiplication  program  presented  in  Example  4.1: 

{v>®0} 

z  :*  0; 

while  v  M  do 
if_  z*0  then 
else if  z*k  then 
else 

v  :=  v  -  1 
od 

{z-vff^k}  . 

The  constraint  functions  Cl  and  C2  are  identical  to  those  for  the 
program  in  Example  4.1  and  we  have  no  reason  to  infer  a  different 
function  g.  Yet  this  function  is  not  only  an  incorrect 
hypothesis,  it  does  not  even  come  close  to  describing  the  general 
behavior  of  the  loop.  The  difficulty  is  that  the  behavior  of  the 
loop  body  on  its  first  execution  is  in  no  way  typical  of  its  gen¬ 
eral  behavior.  This  is  due  to  the  high  dependence  of  the  loop- 
body  behavior  on  the  input  value  of  the  initialized  variable  z. 

We  make  the  following  remarks  concerning  programs  of  this 
nature.  Fi.'rst,  our  experience  indicates  that  they  occur  very 
rarely  in  practice.  Secondly,  because  they  tend  to  be  quite  dif¬ 
ficult  to  analyze  and  understand,  we  consider  them  "tricky"  or 
poorly  structured  programs.  Thirdly,  the  question  of  whether  the 
( inp ut /output)  behavior  of  the  loop  body  on  the  first  iteration 
is  representative  of  its  behavior  on  an  arbitrary  subsequent 
iteration  is  really  a  question  of  whether  its  behavior  when  the 
initialized  variables  have  their  initial  values  is  representative 
of  its  behavior  when  the  initialized  variables  have  "arbitrary” 
values.  Put  still  another  way,  the  question  is  whether  the  loop 
body  behaves  in  a  "uniform"  manner  across  the  spectrum  of  possi¬ 
ble  values  of  the  initialized  data. 

In  practice,  a  consequence  of  a  loop  body  exhibiting  this 
uniform  behavior  is  that  there  exists  a  simply  expressed  connec¬ 
tion  between  different  input  values  of  the  initialized  data  and 
the  corresponding  result  produced  by  the  WHILE  loop.  It  is  the 
existence  of  such  a  connection  which  motivates  the  SUBSTITUTE 
step  above  and  which  is  thus  a  necessary  precondition  for  a  suc¬ 
cess  fu  1  application  of  the  technique.  This  explains  its  failure 


z  :*  k 

z  :*  z  *  2  *  v 
z  :■  z  -  k  fi; 
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in  dealing  with  prog  cans  such  as  that  in  Example  4.9.  Me  make  no 
further  mention  of  these  "tricky"  programs,  and  in  the  following 
section  discuss  an  informal  categorization  of  "reasonable"  pro¬ 
grams  and  consider  its  implications  for  our  loop  function  syn¬ 
thesis  technique. 

4.5.  BO  and  TO  Loops 

In  this  section,  we  discuss  general  characteristics  of  many 
commonly  occurring  iterative  programs.  These  characteristics  are 
used  to  suggest  two  categories  of  these  programs.  This  categori¬ 
zation  is  of  interest  since  the  above  heuristic  for  synthesizing 
loop  functions  is  particularly  useful  when  applied  to  initialized 
loops  in  one  of  these  categories. 

In  solving  any  particular  problem,  it  often  makes  sense  to 
consider  certain  instances  of  the  problem  as  being  "easier"  or 
"harder"  to  solve  than  other  instances.  For  example,  with  the 
problem  of  sorting  a  table,  the  ease  with  vftiich  the  sort  can  be 
performed  may  depend  on  the  size  of  the  table,  i.e.  an  instance 
of  the  problem  for  a  table  containing  N  elements  might  be  harder 
to  solve  than  an  instance  of  the  problem  for  a  table  containing 
N-l  elements.  Similarly,  if  the  problem  is  multiplying  natural 
numbers,  a*b  might  be  easier  to  solve  than  (a+l)*b.  This  notion 
of  "easier"  and  "harder"  instances  of  a  problem  is  particularly 
apparent  for  problems  with  natural  recursive  solutions.  These 
solutions  solve  conplex  instances  in  terms  of  less  coaplex 
instances  and  hence  support  the  idea  of  one  problem  instance 
being  easier  to  solve  than  another. 

For  the  purpose  of  this  discussion,  we  divide  the  data  modi¬ 
fied  by  the  initialized  loop  under  consideration  into  two  sec¬ 
tions:  the  accumulating  data  and  the  control  data.  The  accumu¬ 
lating  data  is  the  specified  output  variable(s)  of  the  loop.  The 
remaining  modified  data  is  the  control  data  and  often  serves  to 
"guide"  the  execution  of  the  loop  and  determine  the  point  at 
which  the  loop  should  terminate.  Both  the  accumulating  data  and 
the  control  data  are  typically  (but  not  always)  constrained  by 
initialization  in  front  of  the  loop. 

Example  4.10a  -  In  the  program 
{n>«0} 

z  :■  1;  t  :■  0; 
while  t  n  do 

- t~ t  +  TT 

z  :*  z  *  t 
od 
{z»nl} 

the  variable  z  is  the  specified  output  of  the  loop  and  is  hence 
the  accumulating  data.  The  other  modified  variable,  t,  is  used 
to  control  the  termination  of  the  loop  and  is  the  control  data. 


In  many  cases,  the  control  data  can  be  viewed  as  represent¬ 
ing  an  instance  (or  perhaps  several  instances)  of  the  problem 
being  solved.  As  the  loop  executes  and  the  control  data  changes, 
the  control  data  represents  different  instances  of  this  problem. 
To  illustrate,  we  can  think  of  the  control  data  t  in  the  previous 
example  as  a  variable  describing  a  particular  instance  of  the 
factorial  problem.  As  the  loop  executes,  the  variable  t  takes  on 
the  values  0,  1,  ...,  n,  and  these  values  can  be  thought  to 

correspond  to  the  problems  Ot,  It,  ...  nl. 

Based  on  these  informal  observations,  we  characterize  a  BU 
(from  the  Bottom  Upward)  loop  as  one  where  the  control  data  prob¬ 
lem  instances  are  generated  in  order  of  increasing  complexity, 
beginning  with  a  simple  instance  and  ending  with  the  input  prob¬ 
lem  instance  to  be  solved.  In  the  execution  of  a  BU  loop,  the 
control  data  can  be  viewed  as  representing  the  "work*  that  has 
been  accomplished  "so  far.."  We  consider  the  factorial  program 
above  to  be  a  BU  loop.  At  any  point  in  time,  the  "work"  so  far 
accomplished  is  t!  and  t  moves  from  0  (a  simple  factorial 
instance)  to  n  (the  input  factorial  instance). 

Conversely,  we  characterize  a  TO  (from  the  Top  Downward) 
loop  as  one  «fcere  the  control  data  problem  instances  are  gen¬ 
erated  in  order  of  decreasing  complexity,  beginning  with  the 
input  problem  instance  and  ending  with  a  simple  problem  instance. 
In  the  execution  of  a  TO  loop,  the  control  data  can  be  viewed  as 
representing  the  "work"  that  remains  to  be  done . 

Example  4.10b  -  We  consider  the  following  alternative  imple¬ 
ments  tTonof  factorial  to  be  a  TO  loops 

{n>»0} 

z  :■  1;  t  :■  n? 

while  t  M  do 
z  :*  z  *  t; 
t  :■  t  -  1 


As  before  z  and  t  are  the  accumulating  and  control  data  respec¬ 
tively.  The  variable  t  moves  from  n  (the  input  factorial 
instance)  and  ends  with  0  (a  simple  factorial  instance).  After 
any  iteration,  the  product  n*(n-l)*  ...  *(t+l)  has  been  accumu¬ 
lated,  leaving  ti  as  the  "work"  that  remains  to  be  done. 

Example  4.11  -  As  an  additional  illustration,  consider  the 
following  three  initialized  loops  which  compute  integer  exponen¬ 
tiation: 
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A: 


{ y >  -  0 } 
w:*l?  t:«0? 
while  t/y  do 
w  :■  w*x? 
t  :«  t+1 


od 


{w-iTy} 


B: 


{y>-0} 
w:»l;  t:-y? 
while  t/*0  do 
w  :■  w*x? 
t  t-1 
od 


{w»x*y} 


C:  {y>“0} 

W!»l;  c:-x?  t:»y? 
while  t^O  do 

IT  odd(t)  then 
w  :*  w*c  £i? 
c:»c*c?  t:*t75 
od 

{w-jTy} 


As  before,  the  symbol  A  is  used  as  an  infix  exponentiation  opera¬ 
tor.  Me  consider  program  A  to  be  a  BU  loop.  The  control  data  t 
moves  from  0  to  y  and  corresponds  to  the  problem  instances  xA0, 

...,  xAy.  On  the  other  hand,  B  is  TD  since  the  control  data  t 

moves  from  y  to  0  and  corresponds  to  the  problem  instances  xAy, 

...,  xA0.  Program  C  (similar  to  that  in  Example  4.2)  is  slightly 

more  difficult  to  analyze.  The  control  data  is  the  pair  <c,t>. 
The  pair  is  initialized  to  <x,y>  and  ends  with  the  value  <c",0>, 
where  c"  is  some  complex  function  of  x  and  y.  It  seems  reason¬ 
able  to  consider  <c,t>  as  representing  the  problem  cAt.  Hence  we 
conclude  C  is  also  TD.  This  conclusion  also  makes  sense  in  light 
of  the  fact  that  C  is  really  an  optimized  version  of  B  which 
saves  iterations  by  exploiting  the  binary  decomposition  of  y. 


The  characterization  of  BU  and  TD  loops  described  here  is, 
of  course,  an  informal  one  and  depends  largely  on  one 's  interpre¬ 
tation  of  the  meaning  or  purpose  of  the  control  data,  we  classi¬ 
fied  the  above  programs  by  using  what  we  considered  to  be  the 
most  "natural"  or  intuitive  interpretation?  other  interpretations 
are  always  possible.  Occasionally,  two  different  interpretations 
of  the  control  data  seem  equally  valid  and  hence  the  program  may 
be  considered  as  either  BO  or  TD,  depending  on  one's  point  of 
view.  For  example,  consider  the  following  program  which  adds  up 
the  elements  in  a  subarray  between  indices  pi  and  p2: 


sum  0?  i  s*  pi? 
while  t  <*  p2  do 

sum  :»  sum  +  a[ i) ? 
i  i  +  1 
od 

{sum»ASUM(a[pl..p2] )} . 

The  notation  ASUM( a[pl .  .p2] )  appearing  in  the  postcondition 
stands  for  the  summation  of  the  elements  in  the  indicated  subar¬ 
ray.  The  question  which  arises  in  attempting  to  classify  this 
program  is  as  follows:  as  the  control  data  i  moves  through  the 
values  pi,  pl+1,  ...,  p2 ,  is  it  most  appropriate  to  think  of  it 
as  representing  the  problem  instance  which  has  been  solved  (i.e. 
ASUM(a[pl . .  i] ) )  or  as. representing  the  problem  instance  which 
remains  to  be  solved  (i.e.  ASUM(  a[  i.  .p2] ) ) .  Both  views  seem 
equally  intuitive,  that  is,  the  program  seems  to  be  as  much  BU  as 
it  is  ID. 
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As  a  final  exanple,  we  refer  back  to  the  program  in  Example 
4.3  which  counts  the  nodes  in  a  binary  tree.  It  is  clear  n  and 
the  set  variable  s  are  the  accumulating  and  control  data  respec¬ 
tively.  Initially,  s  contains  the  tree  whose  nodes  are  to  be 
counted;  when  the  program  terminates  s  is  empty.  In  between,  s 
contains  various  subtrees  of  the  original  tree.  It  seems  natural 
to  view  the  set  as  containing  progressively  simpler  and  simpler 
instances  of  the  NODES  problem  since  the  trees  in  s  consist  of 
fewer  and  fewer  nodes  as  the  loop  executes.  Thus  we  classify  the 
program  as  a  ID  loop. 

We  have  seen  that  the  problem-solving  method  taken  by  a  BO 
loop  is  one  of  approaching  the  general  problem  instance  from  some 
simple  problem  instance.  Of  course,  this  problem-solving  method 
is  reasonable  only  when  there  exists  some  technique  whereby  one 
is  guaranteed  to  "run  into"  the  general  problem  instance.  Our 
view  is  that  in  many  cases,  such  a  convergence  technique  either 
does  not  exist  or  requires  so  much  support  that  the  BU  approach 
is  not  practical.  This  appears  to  be  particularly  true  for  pro¬ 
grams  dealing  with  sophisticated  data  types  (i.e.  something  other 
than  integers)  and  for  programs  requiring  a  high  degree  of  effi¬ 
ciency  in  their  number  of  iterations. 

To  help  see  this  point  of  view,  again  consider  the  NODES 
program  of  Example  4.3.  Previously  we  argued  that  this  was  a  TD 
program.  What  would  a  BU  program  which  computed  the  same  func¬ 
tion  look  like?  The  following  program  skeleton  suggests  itself: 

n  :*  0;  tl  :■  "an  empty  tree"; 

while  tl  y  t  do 

"add  a  node  to  tl  to  make  it  look  more  like  t"; 
n  :■  n  1 
od 

{ n*NODES  ( t)  }  . 

Here,  the  tree  variable  tl  is  the  control  data  and  it  represents 
the  problem  NODES(tl).  The  difficulty  with  this  attempt  at  a 
program  solution  is  the  implementation  of  the  modification  of  tl. 
Such  a  modification  requires  close  inspection  (i.e.  a  traversal) 
of  t  in  order  to  move  tl  toward  t.  In  light  of  this,  it  seems 
more  reasonable  to  count  the  nodes  of  t  while  it  is  being 
inspected  and  to  dispense  altogether  with  the  variable  tl. 

As  an  illustration  of  another  circumstance  where  the  BU 
approach  seems  unreasonable,  the  reader  is  encouraged  to  imagine 
a  BU  implementation  of  integer  exponentiation  which  operates  as 
efficiently  as  the  exponentiation  program  C  from  Example  4.11. 
Again,  a  program  skeleton  suggests  itself: 
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{y>«0} 

w  :»  1;  c  :»  ?;  d  :■  0; 
while  <c,d>  ft  <x,y>  do 
c  :■  aqrt(c)  ; 
if  ?  then 

d  !■  d  *  2  +  1}  w  :•  w  *  c 
else  d  :■  d  *  2  fi 
5T~"  ~ 

{w-jp'y} . 

Here,  we  are  attempting  to  move  the  control  data  <c,d>  toward 
<x,y>  as  fast  as  we  moved  it  away  from  <x,y>  in  TO  program  C.  As 

with  the  BU  NODES  program,  the  problem  here  is  how  to  complete 

the  program  so  as  to  achieve  the  desired  effect.  Our  conclusion 
concerning  this  program  is  that  supplying  an  appropriate  initial 
value  for  c  and  determining  the  proper  loop-body  path  to  be  exe¬ 
cuted  requires  such  complexity  that  this  approach  is  not  a  feasi¬ 
ble  alternative  to  program  C. 

In  this  section  we  have  suggested  two  informal  categories  of 
initialized  loop  programs.  We  offered  the  opinion  that  the 
approach  taken  in  a  BU  program  solution  has  rather  limited  appli¬ 
cability  and  that  TO  programs  tend  to  occur  more  frequently  in 

practice.  We  feel  that  this  characterization  is  useful  as  a 

study  of  opposing  problem-solving  philosophies  but  our  main 
source  of  motivation  is  to  investigate  the  kinds  of  commonly 
occurring  programs  on  which  the  loop  function  synthesis  technique 
described  above  works  well. 

Consider  applying  this  technique  to  a  general  TO  program. 
In  the  second  constraint  function,  the  control  data  is  bound  to  a 
value  vAiich  represents  a  slightly  less  complex  instance  of  the 
general  problem  being  solved  by  the  initialized  loop.  In  prac¬ 
tice,  the  appearance  of  this  value  in  the  constraint  function 
suggests  the  problem  decomposition  being  exploited  by  the  pro- 
g rammer  in  order  to  achieve  the  program  result.  Applying  this 
decomposition  in  REWRITE  leads  quite  naturally  to  the  desired 
general  loop  function. 

Example  4.12  -  Consider  the  TO  factorial  program  from  Exam¬ 
ple  4 Jlbb.  T^e  second  constraint  function  is 
C2s  n>0,  z»n,  t»n-l  ->  g(z,t)«nl 
The  control  data  t  being  bound  to  n-1  suggests  REWRITTIN8  n!  as 
n*(n-l)i.  This  leads  to  the  correct  general  loop  function.  On 
the  other  hand,  consider  the  second  constraint  function  for  the 
BU  factorial  program  from  Example  4.10a: 

C2:  n>0,  z«l,  t*l  ->  g(z,t,n)-nl 

Bow  can  the  expression  nl  be  rewritten  in  terms  of  1,  1  and  n? 

To  obtain  the  correct  general  function,  the  expression  would  have 
to  be  rewritten  as  (l*nl)/(ll)  which  seems  much  less  intuitive 
than  that  required  for  the  TO  version.  As  another  point  of  com¬ 
parison,  consider  the  second  constraint  function  for  the  TO 
exponentiation  program  B  from  Example  4.11: 


-66- 


C2:  y>0,  w~x,  t*y-l  ->  g(w,  t,  x)»xAy 

and  the  second  constraint  function  for  the  BU  exponentiation  pro¬ 
gram  A  from  the  same  example: 

C2:  y>0,  w-x,  t«l  ->  g(w,  t,x,y)»xAy. 

In  both  cases,  the  proper  loop  function  may  be  obtained  by  using 
the  REWRITE  rule  xAy  ■  x*(xA(y-l));  however,  this  particular  rule 
seems  more  strongly  suggested  in  the  constraint  function  for  the 
TD  program. 

We  remark  that  the  same  general  phenomenon  occurs  with  TD 
programs  in  the  event  the  control  data  has  been  SIMPLIFIED  out  of 
the  domain  requirement  for  C2.  In  this  case,  the  fact  that  the 
control  data  represents  a  slightly  less  coup  lex  instance  of  the 
general  problem  being  solved  manifests  itself  in  the  function 
expression  for  the  SIMPLIFIED  C2  being  a  slightly  more  complex 
instance  of  the  problem  being  solved.  For  example,  the  con¬ 
straint  function  C2  above  for  the  TD  exponentiation  program  B  of 
Example  4.11  can  be  SIMPLIFIED  to 

t>«0,  w»x  ->  g(w,  t,x)  »xA  (t+l) . 

Before,  the  appearance  of  y-1  in  the  domain  requirement  suggested 
rewritting  xAy  as  x*  (xA  (y-1) ) .  Here,  the  appearance  of  t+l  in 
the  function  expression  suggests  rewritting  x  (t+l)  as  x*(xAt) 
(see  also  Examples  4.1  and  4.2). 

Suppose  f  is  the  operation  or  function  the  initialized  loop 
program  is  intended  to  compute,  in  Section  4.2  we  observed  that 
each  REWRITE  in  the  examples  of  that  section  involved  "decompos¬ 
ing"  an  application  of  f.  This  decomposition  corresponds  to 
rewritting  that  problem  instance  in  terms  of  a  slightly  less  com¬ 
plex  problem  instance  (or  instances).  In  general,  of  course, 
there  are  many  ways  this  decomposition  can  be  performed.  In  the 
examples  of  that  section,  however,  as  with  all  TD  programs,  the 
nature  of  the  control  data  serves  to  guide  this  decomposition  and 
thus  tends  to  make  the  REWRITE  step  quite  straightforward  in 
practice. 

The  reader  may  have  noticed  that  the  general  loop  functions 
for  the  BU  factorial  and  exponentiation  programs  contain  more 
program  variables  and  operations  on  those  variables  than  their  TD 
counterparts.  For  instance,  the  general  loop  functions  for  the 
BU  and  ID  factorial  programs  are 

0<*t<«n  ->  g(z,  t, n)«z*  (nl/tl) 

and 

0<»t  ->  g(z,t)»z*tl 

respectively.  This  fact,  by  itself,  helps  explain  why  the  loop 
function  synthesis  technique  seems  more  difficult  to  apply  on  BU 
programs.  It  would  be  a  mistake,  however,  to  assume  that  the  BU 
programs  are  more  "complex"  or  are  more  difficult  to  analyse  or 
prove,  we  consider  TD  loops  to  be  somewhat  more  susceptible  to 
the  form  of  induction  employed  in  functional  loop  verification. 
More  precisely,  the  inductive  hypothesis  required  in  this  type  of 
proof  (i.e.  a  general  statement  concerning  the  loop  input/output 
behavior)  seems  to  be  more  easily  stated  for  TD  programs  than  for 


BU  program®.  On  the  other  hand,  BU  programs  seem  somewhat  more 
susceptible  to  an  inductive  assertion  proof.  The  inductive 
hypothesis  required  in  this  type  of  proof  (i.e.  a  sufficiently 
strong  loop  invariant)  involves  fewer  program  variables  and 
operations  on  those  variables  than  the  same  type  of  hypothesis 
for  the  corresponding  TO  loop.  As  an  example  the  BU  and  TO  fac¬ 
torial  programs  have  adequate  loop  invar  ts  0<«t  &  z*t!  and 
0<»t<»n  &  z»nl/tl  respectively. 

In  [Manna  &  Waldinger  70]  ,  the  authors  describe  a  program 
synthesis  technique  and  point  out  that  their  method  produces 
either  of  the  above  factorial  programs  depending  upon  which  type 
induction  rule  the  synthesizer  is  given  to  employ. 

4.6.  Related  Work 

In  [Basu  &  Misra  76,  Misra  78,  Misra  79],  the  authors 
describe  two  classes  of  "naturally  provable"  programs  for  which 
generalized  loop  specifications  can  be  obtained  in  a  determinis¬ 
tic  manner.  The  technique  proposed  in  this  chapter  sacrifices 
determinism  in  favor  of  wide  applicability  and  ease  of  use.  It 
handles  in  a  fairly  straightforward  manner  typical  programs  in 
these  two  program  classes  (e.g.  Examples  4. 1-4. 3)  as  well  as  a 
number  of  programs  which  do  not  fit  in  either  of  the  classes 
(e.g.  Examples  4. 4-4. 6). 

Due  to  the  close  relationship  between  loop  functions  and 
loop  invariants  (as  discussed,  for  example,  in  Chapter  2),  any 
technique  for  synthesizing  loop  invariants  can  be  viewed  as  a 
technique  for  synthesizing  general  loop  functions  (and  vice 
versa).  In  this  light,  our  method  bears  an  interesting  resem¬ 
blance  to  a  loop  invariant  synthesis  technique  described  in  [ Weg- 
breit  74,  Katz  &  Manna  76]  .  in  this  technique  stronger  and 
stronger  "approximations"  to  an  adequate  loop  invariant  are  made 
by  pushing  the  previous  approximation  back  through  the  loop  once, 
twice,  etc. 

By  way  of  illustration,  consider  the  exponentiation  program 
of  Example  4.2.  The  loop  exit  condition  can  be  used  to  obtain  an 
initial  loop  invariant  approximation 
d»0  ->  w»c0Ad0. 

This  approximation  can  be  strengthened  by  pushing  it  back  through 
the  loop  to  yield 

(d«0  ->  w»c0Ad0)  &  (d« 1  ->  w*c«c0Ad0). 

In  the  analysis  presented  in  Example  4.2,  we  obtained  a  value  for 
the  generalized  function  specification  for  each  of  two  different 
values  of  the  initialized  variable  w  (i.e.  1  and  9QRT(c));  here 
we  have  obtained  a  "value”  for  the  loop  invariant  we  are  seeking 
for  each  of  two  different  values  of  the  variable  which  controls 
the  termination  of  the  loop  d.  Applying  the  analysis  in  [Morris 
&  wegbreit  77]  ,  these  loop  invariant  "values"  can  be  translated 
to  constraint  functions  as  follows* 
d«0  ->  g(w,c,d)-w, 
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d«l  ->  g(w,c,d)  «w*c. 

Of  course,  the  function  expression  w*c  in  the  second  constraint 
can  be  rewritten  w*(cAl);  SUBSTITUTING  as  usual  suggests  the  gen¬ 
eral  loop  function 

g(w,c,d)  «w*(c*d) . 

If  we  then  add  the  program  precondition  as  a  domain  restriction 
on  this  function,  the  result  is  the  same  general  loop  function 
discovered  in  Example  4.2. 

We  summarize  the  relationship  between  these  two  techniques 
as  follows.  As  the  initialized  loop  in  question  operates  on  some 
particular  input,  let  X[0]  ,  XII]  ,  a.  ,X[N]  be  the  sequence  of 
states  on  which  the  loop  predicate  is  evaluated  (i.e.  the  loop 
body  executes  N-l  times).  Of  course,  in  X[0]  ,  the  initialized 
variables  have  their  initial  values,  and  in  X[N],  the  loop  predi¬ 
cate  evaluates  to  FALSE.  The  method  proposed  in  this  chapter 
suggests  inferring  the  unknown  loop  function  g  from  X[0]  ,  X[l]  , 
g(X[0] )  and  g(X[l]).  The  loop  invariant  technique  described 
above,  when  viewed  as  a  loop  function  technique,  suggests  infer¬ 
ring  g  from  X [N] ,  X[N-1]  ,  g(X[N] )  and  g(x[N-l]).  Speaking 

roughly  then,  one  technique  uses  the  first  several  executions  of 
the  loop,  the  other  uses  the  last  several  executions.  One 
ignores  the  information  that  the  loop  must  conpute  the  identity 
function  on  inputs  where  the  loop  predicate  is  FALSE,  the  other 
ignores  the  information  that  the  loop  must  conpute  like  the  ini¬ 
tialized  loop  when  initialized  variables  have  their  initial 
values. 

Earlier  we  discussed  "top  down”  and  "bottom  up”  approaches 
to  synthesizing  g  and  indicated  that  our  technique  fit  in  the 
"top  down"  category.  The  technique  based  on  the  last  several 
iterations  is  a  "bottom  up"  approach.  It  is  difficult  to  care¬ 
fully  state  the  relative  merits  of  these  two  opposing  techniques. 
In  cur  view,  however,  there  are  a  number  of  circumstances  under 
which  the  technique  based  on  the  first  several  loop  executions 
seems  more  "natural"  and  easily  applied.  These  exanples  include 
the  NODES  program,  the  program  to  conpute  Acker  mannas  function 
and  the  TO  factorial  program  discussed  above.  The  reason  is  that 
a  critical  aspect  of  the  general  loop  function  is  the  function 
conputed  by  the  initialized  loop  program  (e.g.  exponentiation  in 
the  above  illustration)  .  In  the  technique  based  on  the  first 
several  iterations,  this  function  appears  explicitly  in  the  con¬ 
straint  functions.  In  the  other  technique,  this  information  must 
somehow  be  inferred  from  the  corresponding  constraint  functions 
(e.g.  by  looking  for  a  pattern  in  these  functions,  etc.).  This 
difficulty  is  inherent  in  any  "bottom  up"  approach  to  synthesiz¬ 
ing  g. 


In  Chapter  3,  we  presented  several  guidelines  for  the  syn¬ 
thesis  of  reduction  hypotheses.  Heuristic  SHORTCUT  of  Theorem 
3.2  in  that  chapter  is  based  on  the  observation  that  the  programs 
A:  PI;  B:  PI;  P2; 

while  Bl  do  while  B1  do 
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,V 


P2  P2 

od  od 

are  equivalent  on  inputs  which  require  at  least  one  loop  itera¬ 
tion  on  program  A.  This  is  the  underlying  VT!ILE  loop  property 
that  is  used  to  obtain  constraint  function  C2  in  the  loop  func¬ 
tion  derivation  technique  described  in  this  chapter.  Specifi¬ 
cally,  C2  for  the  program  A  is  exactly  constraint  function  Cl  for 
the  program  B. 

4.7.  Discussion 


In  this  chapter  we  have  proposed  a  technique  for  deriving 
functions  which  describe  the  general  behavior  :f  a  loop  which  is 
preceded  by  initialization.  These  functions  c  .n  be  used  in  a 
functional  [Mills  75]  or  subgoal  induction  [Morris  &  Wegbreit  77] 
proof  of  correctness  of  the  initialized  loop  program.  It  is  not 
our  intention  to  imply  that  verification  should  occur  after  the 
programming  process  has  been  conpleted.  There  are,  however,  a 
large  number  of  existing  programs  which  must  be  read,  understood, 
modified  and  verified  by  "maintenance"  personnel.  Me  offer  the 
heuristic  as  a  tool  which  is  intended  to  facilitate  these  tasks. 

It  has  been  argued  [Misra  78]  that  the  notion  of  closure  of 
a  loop  with  respect  to  an  input  domain  is  fundamental  in  analyz¬ 
ing  the  loop.  In  Section  4.3,  this  idea  is  applied  to  initial¬ 
ized  loop  programs.  The  result  is  that  a  loop  function  g  for  a 
loop  which  is  N-closed  (for  some  N>0)  can  be  synthesized  in  a 
deterministic  manner  by  considering  the  first  N  constraint  func¬ 
tions.  Hence  this  categorization  can  be  viewed  as  one  measure  of 
the  "degree  of  difficulty"  involved  in  verifying  initialized  loop 
programs. 

An  interesting  direction  for  future  research  is  the  develop¬ 
ment  of  a  precise  characterization  of  programs  which  are  not 
"tricky"  (as  discussed  in  Section  4.4).  Some  results  along  this 
line  are  described  in  Chapter  5  (see  also  [Basu  80]). 

In  Section  4.5  we  discussed  on  an  informal  level  the  oppos¬ 
ing  BO  and  TD  problem-solving  strategies  and  their  corresponding 
initialized  loop  realizations.  Me  argued  that  the  TD  approach 
appeared  to  be  more  widely  applicable  and  that,  in  practice,  TD 
programs  seem  to  occur  more  frequently.  Me  explained  the  success 
of  the  proposed  loop  function  creation  technique  on  these  pro¬ 
grams  in  terms  of  an  easily  applied  REM  RITE  step.  These  results 
are  offered  to  help  support  our  view  that  the  technique  may  be 
enployed  successfully  in  a  wide  range  of  applications. 
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Consider  the  problem  of  proving  /disproving  a  WHILE  loop 
correct  with  respect  to  some  functional  specification  f,  i.e.  f 
requires  the  output  variable (s)  to  be  some  function  of  the  inputs 
to  the  loop.  If  the  loop  precondition  is  weak  enough  so  that  the 
domain  of  f  contains  the  intermediate  states  which  appear  after 
each  loop  iteration  (i.e.  if  the  loop  precondition  is  a  loop 
invariant),  the  loop  is  said  to  be  closed  for  the  domain  of  f. 
In  Chapter  2  (Theorem  2.1)  we  saw  that  if  the  loop  is  closed  for 
the  domain  of  its  specification,  there  are  two  easily  constructed 
verification  conditions  based  solely  on  the  specification,  loop 
predicate  and  loop  body  which  are  necessary  and  sufficient  condi¬ 
tions  for  the  correctness  of  the  loop  (assuming  termination)  with 
respect  to  its  specification  [Mills  75,  Misra  78).  If  the  loop 
is  not  closed  for  the  domain  of  the  specification  function,  a 
generalized  specification  (i.e.  one  that  implies  the  original 
specification)  which  satisfies  the  closure  requirement  must  be 
discovered  before  these  verification  conditions  can  be  con¬ 
structed  (this  problem  is  analogous  to  that  of  discovering  an 
adequate  loop  invariant  for  an  inductive  assertion  proof  [Hoare 
69]  of  the  program) . 

We  remark  that  the  restricted  specification  often  occurs  in 
the  process  of  analyzing  an  initialized  WHILE  loop,  i.e.  one  that 
consists  of  a  WHILE  loop  preceded  by  some  initialization  code. 
This  initialization  typically  takes  the  form  of  assignments  of 
constant  values  to  some  of  the  variables  manipulated  by  the  loop. 
Examples  include  setting  a  counter  to  zero,  a  search  flag  to 
FALSE,  a  queue  variable  to  some  particular  configuration,  etc. 
It  is  clear  that  the  initialized  loop  is  correct  with  respect  to 
some  spec ification  if  and  only  if  the  WHILE  loop  by  itself  is 
correct  with  respect  to  a  slightly  modified  epee  ification.  This 
specification  has  the  same  postcondition  as  the  original  specifi¬ 
cation  and  a  precondition  which  is  the  original  precondition 
together  with  the  condition  that  the  initialized  variables  have 
their  initialized  values.  Since  the  initialized  variables  will 
typically  assume  other  values  as  the  loop  iterates,  the  loop  most 
likely  will  not  be  closed  for  the  domain  of  this  specification 
and  a  generalization  of  it  will  be  necessary  in  order  to  verify 
the  correctness  of  the  program. 


Example  5.1  -  The  following  program  multiplies  natural 

numbers  using~"repeated  addition: 

{v>«0,k>»0} 

z  :■  0; 

while  v  >  0  do 
z  :■  z  +  ITT 
v  :■  v  -  1 
od 

{z«vti*k} . 


The  term  vO  appearing  in  the  postcondition  refers  to  the  initial 
value  of  v.  The  program  is  correct  if  and  only  if 

{z»0,  v>-0,  k>*0} 
while  v  >  0  do 
z  :«  z  +  TcT 
v  :»  v  -  1 
od 

{z»v5*k} 

is  correct.  Since  this  loop  precondition  requires  z  to  have  the 
value  0  and  z  assumes  other  values  as  the  loop  executes,  the  loop 
is  not  closed  for  this  precondition.  Thus,  before  this  program 
can  be  verified  using  the  above  mentioned  technique,  this  specif¬ 
ication  must  be  generalized  to  something  like 

{ v>*0,k>-0} 
while  v  >  0  do 
z  :*  z  +  TcT 
v  :»  v  -  1 
od 

{z*ztT  +  vO*k} 

where  zO  refers  to  the  initial  value  of  the  variable  z. 

The  approach  to  this  problem  suggested  here  is  one  of 
observing  how  particular  changes  in  the  value  of  some  input  vari¬ 
able  (e.g.  z  in  the  example)  affect  the  result  produced  by  the 
loop  body  of  the  loop  under  consideration.  Clearly  in  general,  a 
change  in  the  value  of  an  input  variable  may  cause  an  arbitrary 
(and  seemingly  unrelated)  change  in  the  loop-body  result.  In 
many  commonly  occurring  cases,  however,  the  result  produced  by 
the  loop  body  is  "uniform"  across  the  entire  spectrum  of  possible 
values  for  the  input  variable.  It  is  this  property  that  will  be 
exploited  in  order  to  obtain  a  generalized  specification  for  the 
loop  being  analyzed.  The  generalizations  considered  here  have 
the  property  that  the  loop  is  correct  with  respect  to  the  gen¬ 
eralization  if  and  only  if  the  loop  is  correct  with  respect  to 
the  original  specification.  Thus  if  the  loop  is  closed  for  the 
domain  of  the  generalization,  the  program  can  be  proven/d isproven 
by  testing  its  correctness  relative  to  the  generalization. 

It  is  natural  to  expect  that  the  ease  with  which  a  general¬ 
ized  specification  may  be  obtained  for  a  loop  would  depend 
largely  on  the  nature  of  the  loop.  Results  in  [Wegbreit  77]  ,  for 
example,  show  that  the  problem  of  generalizing  the  loop  specifi¬ 
cation  for  any  program  in  a  particular  class  of  programs  is  NP- 
complete.  On  the  other  hand,  work  presented  here  and  elsewhere 
[Basu  &  Misra  76,  Misra  79,  Basu  80],  indicates  that  there  do 
exist  categories  of  loops  foe  which  generalized  specifications 
can  be  obtained  in  a  direct,  routine  manner.  We  feel  that  the 
notion  of  "uniform"  loop-body  behavior  discussed  in  this  chapter 
is  valuable  not  only  as  a  tool  by  which  such  generalizations  may 
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be  obtained,  but  also  as  an  attempt  at  a  characterization  of 
loops  which  are  susceptible  to  routine  analysis,  and  hence  in 
this  sense,  easy  to  verify  and  comprehend. 

The  following  section  defines  the  necessary  notation  and 
terminology  and  then  introduces  the  idea  of  a  generalized  loop 
specification.  Section  5.2  defines  a  uniformly  implemented  loop 
and  states  several  implications  of  this  definition  for  the  prob¬ 
lem  of  generalizing  a  specification  for  such  a  loop.  These 
results  are  applied  on  several  example  programs  in  Section  5.3. 
In  Section  5.4,  a  simplified  procedure  is  suggested  for 
proving /disproving  a  uniformly  implemented  loop  correct  with 
respect  to  the  obtained  generalization.  Finally,  several  guide¬ 
lines  for  recognizing  uniformly  implemented  loops  are  presented 
in  Section  5.5. 

,5.1.  Preliminaries 

We  will  consider  a  verification  problem  of  the  form 

{<z,y>  6  D(f) } 
while  B(<z,y>)  do 

z,y  :■  h " ( z , y)  ,h~(z,y) 
od 

{<z,y>  -  f  (<zO,yO>)} . 

In  this  problem,  f  is  a  data  -state -to -data  -state  function.  The 
data  state  consists  of  two  variables,  z  and  y.  The  terms  zO  and 
yO  refer  to  the  initial  values  of  z  and  y  respectively.  The 
effect  of  the  loop  body  is  partitioned  into  two  functions  h"  and 
h"  which  describe  the  new  values  of  z  and  y  respectively. 

The  loop  will  be  referred  to  as  P.  As  before,  the  data- 
state -to -data -state  function  confuted  by  the  loop  (which,  presum¬ 
ably,  is  not  explicitly  known)  will  be  denoted  [P ) .  Thus  D{  [P ] ) 
is  the  set  of  states  for  which  P  terminates.  As  a  shorthand 
notation  we  will  use  X  for  the  state  <z,y>,  and  H  for  the  data- 
state -to -data -state  function  confuted  by  the  loop  body,  i.e. 

H  (X)  »  H(<z,y>)  -  <h  "  ( z ,  y)  ,h‘"(z,y)>. 

Suppose  the  loop  is  not  closed  for  0(f)  in  that  this  set 
contains  only  a  restricted  collection  of  values  (maybe  only  one) 
of  z  and  that  other  intermediate  values  of  z  occur  as  the  loop 
iterates.  The  variable  z  will  be  called  the  key  variable .  Our 
goal  here  is  to  discover  some  more  general  specification  t'  which 
includes  each  of  these  intermediate  values  of  the  key  variable  in 
its  domain.  This  generalization  process  (in  one  form  or  another) 
is  necessary  for  a  proof  of  correctness  of  the  program  under  con¬ 
sideration. 

Definition  5.1  -  P  is  correct  wrt  a  function  f  iff  for  all  X 
in  D( f) ,  [P ]  (X)  Ts” defined  and  [P]  (X)«f  (X) . 


Definition  5.2  -  A  superset  £'  of  f  is  a  valid  generaliza¬ 
tion  of  f  iff  i^P^is  correct  wrt  f,  then  P  is  correct  wrt  £'. 

Note  that  the  collection  of  supersets  of  f  is  partially 
ordered  by  "is  a  valid  generalization  of."  The  following  defini¬ 
tion  supplies  the  notation  we  will  use  to  describe  generaliza¬ 
tions  of  the  specification  function  f. 

Definition  5.2  -  If  S  is  a  set  of  ordered  pairs  of  data 
states^  P  IS"  the  extension  of  f  defined  by  S  iff  £'  is  a  func¬ 
tion  and  f"  is  the  union  of  f  and~S.  “ 

Definition  2*1  -  If  9  is  the  extension  of  f  defined  by 

(5.1)  tlxTx)  |  ~B  (X) }  , 

then  g  is  the  base  generalization  of  f. 

Thus,  if  g  is  the  base  generalization  of  f,  then 
g(Xl)*X2  <->  (f (X1)=X2  OR  (~B (Xl)  &  Xl=X2) ) . 

Throughout  this  chapter,  we  will  continue  to  use  the  function 
symbol  g  for  the  base  generalization  of  f.  We  remark  that  g 
exists  provided  the  union  of  f  and  (5.1)  is  a  function,  i.e.  pro¬ 
vided 

X  e  D(f)  &  ~B  (X)  ->  f(X)*X 

holds.  If  this  condition  is  not  satisfied,  P  is  not  correct  wrt 
f.  Hence,  if  P  is  correct  wrt  f,  the  base  generalization  of  f 
exists. 

Theorem  jj.l  -  If  g  is  the  base  generalization  of  f,  then  g 
is  a  valid  generalization  of  f. 

Proof  -  Suppose  P  is  correct  wrt  f.  We  must  show  P  is 
correct  wrt  g.  Let  X  6  D(g)  .  If  X  B  D(f),  the  loop  handles  the 
input  correctly  by  hypothesis.  If  X  is  not  in  D(f),  we  must  have 
~B  (X)  and  g(X)*X.  Thus  the  program  and  g  map  X  to  itself  and 
thus  are  in  agreement.  Consequently  P  is  correct  wrt  g,  and  g  is 
a  valid  generalization  of  f. 

The  theorem  utilizes  the  fact  that  the  loop  must  necessarily 
compute  the  identity  function  over  inputs  where  the  loop  predi¬ 
cate  is  false.  Combining  this  information  with  the  program 
specification  f  results  in  a  valid  generalization  of  f. 

Definition  5.5  -  A  valid  generalization  £'  of  f  is  adequate 
if  the  loop  is  closed  for  D(f'). 

The  important  characteristic  of  an  adequate  valid  generali¬ 
zation  f'  is  that  it  can  be  used  to  prove  /disprove  the  correct¬ 
ness  of  P  wrt  the  original  specification  f.  Since  the  loop  is 
closed  for  D(f'),  P  can  be  proven/d isproven  correct  wrt  £'  using 
standard  techniques  (Mills  72,  Mills  75,  Basu  &  Misra  75,  Morris 
&  Wegbreit  77,  wegbreit  77,  Misra  78]  .  Specifically,  P  is 
correct  wrt  £'  iff  each  of 

(5.2)  the  loop  terminates  for  all  X  €  D(f') 


(5.3)  X  e  D(  f  ")  &  “B  (X)  ->  f"(X)«X 

(5.4)  X  e  D(f')  &  B  (X)  ->  f  "(X)«f  "(H  (X) ) 

hold.  If  P  is  correct  wrt  f",  then  P  is  necessarily  correct  wrt 
any  subset  of  f",  including  f.  If  P  is  not  correct  wrt  f",  then 
by  the  definition  of  a  valid  generalization,  P  must  not  be 
correct  wrt  f. 


Example  5.2  -  Bie  following  program  tests  whether  a  particu¬ 
lar  key  appears- in  an  ordered  binary  tree. 

{ success*F ALSE } 

while  tree  ^  NULL  &  “ success  do 

IT  name(tree)  =  key  then  success  :*  TRUE 
elseif  name  (tree)  <  key  then  tree  :*  right(tree) 
else  tree  :*  left  (tree)  fi 

qS 

{success  »  IN(treeO,key) } 


The  function  IN(treeO,key)  appearing  in  the  postcondit ion  is  a 
predicate  which  means  "the  ordered  binary  tree  treeO  contains  a 
node  with  name  field  key.”  The  boolean  variable  success  is 
chosen  as  the  key  variable  since  it  is  constrained  to  the  value 
FALSE  in  the  input  specification.  Thus  success  plays  the  role  of 
z  and  the  pair  of  variables  <tree,key>  correspond  to  y  in  the 
program  schema  discussed  above.  The  specification  function  f  is 
f ( <FALSE, tree, key >)  *  < IN (tree, key) , tree", key "> 
where  tree"  and  key"  are  the  final  values  of  the  variables  tree 
and  key  computed  by  the  loop,  respectively.  That  is,  since  the 
final  values  of  these  variables  are  not  of  interest  in  this  exam¬ 
ple,  we  specify  these  final  values  so  as  to  be  automatically 
correct.  Using  theorem  5.1,  a  valid  generalization  of  this 
specification  is 

g ( <success, tree, key>)  *  if  “success  then 

<IN (tree,  key)  ,  tree",  key"> 
else  if  tree»NULL  OR  success  then 
ouccess,  tree,  key> , 

which  is  equivalent  to 

g  ( ouccess,  tree,  key>)*<success  OR  IN  (tree,  key) ,  tree",  key  ">. 


In  this  example,  the  domain  of  the  base  generalization  g  of 
f  includes  each  value  of  the  key  variable,  (i.e.  FALSE  and  TRUE) 
and  is  thus  adequate.  Consequently,  this  generalization  can  be 
used  to  prove /disprove  the  correctness  of  the  program. 


In  most  cases,  however,  the  heuristic  suggested  in  Theorem 
5.1  is  insufficient  to  generate  an  adequate  generalization. 
Indeed,  the  base  generalization  is  an  adequate  generalization 
only  in  the  case  when  the  sole  reason  for  the  closure  condition 
not  holding  is  the  existence  of  potential  final  values  of  the  key 
variable  (e.g.  TRUE  in  the  example)  which  are  absent  from  0(f). 
In  order  to  obtain  a  generalization  that  includes  general  values 
of  the  key  variable,  an  important  characteristic  of  the  loop  body 
which  seems  to  be  present  in  many  commonly  occurring  loops  will 
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be  exploited. 

5.2.  Uniformly  Implemented  Loops 

Definition  5.6  -  Let  P  be  a  loop  of  the  form  described 
above.  Let  C  be  a  set/  let  Z  be  the  set  of  values  the  key  vari¬ 
able  z  may  assume/  and  let  Y  be  the  set  of  values  the  remaining 
variable(s)  y  may  assume.  Let 
$'  :  C  x  Z  ->  Z 

be  an  infix  binary  operator.  The  loop  P  is  uniformly  implemented 
wrt  $"  iff  each  of 

(5.5)  B  ( <z , y> )  ->  h'  (c  $"  Z/y)  -  c  $'  h "  (z,y) 

(5.6)  B ( <z , y> )  ->  h~(c  $'  z,y)  -  h~(z,y) 

(5.7)  B  ( <z / y> )  ->  B  ( <c  z,y>) 

hold  for  all  c  6  C,  z  e  Z  and  all  y  6  Y. 

Conditions  (5.5)  and  (5.6)  of  this  definition  state  that  a 
modification  to  the  key  variable  by  the  operation  causes  a 
slight  but  orderly  change  in  the  result  produced  by  the  loop 
body.  The  change  is  slight  because  the  only  difference  in  the 
result  produced  by  the  loop  body  occurs  in  the  key  variable. 
This  difference  is  orderly  because  it  corresponds  precisely  to 
the  same  $ *  operation  that  served  to  modify  the  input  value  of 
the  key  variable.  Condition  (5.7)  specifies  that  such  a  modifi¬ 
cation  does  not  cause  the  loop  predicate  B  to  change  from  true  to 
false. 

As  a  shorthand  notation  we  define  the  infix  operator  $  as 
c  $  X  ■  c  $  <z,y>  *  <c  $"  zry>. 

In  this  notation  (5.5) -(5.7)  are  equivalent  to 

(5.8)  B  (X)  ->  c  $  H  (X)  -  H(c  $  X) 
and 

(5.9)  B  (X)  ->  B(c  $  X)  . 


Example  5.3  -  Consider  again  the  program  from  Example  5.1 
which  multiplies  natural  numbers  using  repeated  addition: 

{z»0/V>«0/k>»0} 
while  v>  0  do 
z  :■  z  +  k; 
v  :■  v  -  1 
od 

{z«v(S*k}  . 

Let  z  be  the  key  variable.  The  pair  <vrk>  corresponds  to  the 
variable  y  occurring  in  the  above  schema.  The  loop  is  uniformly 
implemented  wrt  +,  where  c  and  z  are  both  the  set  of  natural 
numbers.  Note  that  adding  some  constant  to  the  input  value  of  z 
has  the  effect  of  adding  the  same  constant  to  the  value  of  z  out¬ 
put  by  the  loop  body.  Now  consider  the  following  alternative 
implementation  of  multiplication: 


{z»0,v>»0,k>»0} 
while  v>  0  do 

if  Tc«0  &  z*0  then  z  :*  v  -  1 

else IF  k-0  &  z^O  then  z  :■  z  -  1 

else if  z<k  then  z  :»  z  +  k 

else If  z«k  then  z  :■  z  *  2  *  v 

else  z  :«  z  -  k  fi; 

v  :■  v  -  1 
od 

{z»v5*k} . 

Again,  let  z  be  the  key  variable.  This  loop  is  not  uniformly 
implemented  wrt  +.  Intuitively,  this  is  due  to  the  high  degree 
of  dependence  of  the  loop-body  behavior  on  the  value  of  the  key 
variable.  The  result  of  this  dependence  is  that  adding  some  con¬ 
stant  to  the  value  of  z  causes  an  unorderly  change  in  the  value 
of  z  output  by  the  loop  body. 

The  reader  may  wonder  if  the  second  nu  It  ip  lie  at  ion  program 
above  might  be  uniformly  implemented  wrt  some  operation  other 
than  +.  We  remark  that  any  loop  is  uniformly  implemented  wrt  $“ 
:  C  x  Z  ->  Z  defined  by 
c  z  *  z 

for  all  c  e  C  and  z  e  Z.  For  the  purpose  of  our  research,  we 
rule  out  such  trivial  operations,  i.e.  we  require  that  for  some  z 
e  Z,  there  exists  some  c  e  C  such  that 
c  z  ft  z. 

With  this  assumption,  there  does  not  exist  an  operation  wrt  which 
the  second  of  the  above  loops  is  uniformly  implemented  (or  more 
briefly,  the  loop  is  not  uniformly  implemented)  .  To  see  this, 
suppose  the  loop  were  uniformly  implemented  wrt  $"  :  C  x  Z  ->  Z. 
Let  cO  and  zO  be  some  fixed  elements  of  C  and  Z,  respectively, 
vhich  satisfy 

cO  $'  zO  /  zO. 

Since  (5.5)  must  hold  for  all  c  e  C,  z  6  Z  and  all  y  e  Y,  we 
choose  z*zO,  c»cO  and  k*cO  zO.  Applying  (5.5)  gives 
v>0  ->  h'(cO  zO,<v,cO  $"  zO>)»cO  h'(zO,<v,cO  $'  z0>) 
for  all  v.  We  consider  3  exhaustive  cases  based  on  the  values  of 
cO  and  zO.  First,  suppose  cO  $"  zO  *  0.  Then  we  must  have 
(since  zO  f  0) 

v> 0  ->  v-l»c0  $“  (zO-1). 

Since  this  must  hold  for  all  v  and  the  value  cO  $'  (zO-1)  is 

fixed  by  the  original  selection  of  cO  and  zO,  this  is  a  contrad¬ 
iction.  Next,  suppose  cO  $"  zO  ft  0  and  zO  <  cO  $'  zO.  Then 
v> 0  ->  (cO  $'  z0)*2*v»c0  $'  (z0+(c0  $“  zO)) 
for  all  v.  Again,  since  the  expression  to  the  right  of  the 
equality  sign  is  fixed  and  (cO  z0)*2*v  varies  with  different 
values  of  v  (cO  $'  zO  is  nonzero),  this  is  a  contradiction.  The 
third  case,  where  cO  $"  zO  f  0  and  zO  >  cO  $'  zO,  leads  to  a 
similar  contradiction,  and  thus  (5.5)  does  not  hold.  We  conclude 
that  the  second  multiplication  program  above  is  not  uniformly 
implemented.  That  is,  there  does  not  exist  a  nontrivial  modifi¬ 
cation  that  can  be  applied  to  the  variable  z  which  always  results 


77- 


in  a  slight  and  orderly  change  in  the  result  produced  by  the  loop 
body. 


The  results  presented  here  are  based  on  the  following  lemma 
concerning  uniformly  implemented  loops.  The  lemma  describes  the 
output  of  a  uniformly  implemented  loop  P  for  some  modified  input 
c  $  X  (i.e.  IP]  (c  $  X))  in  terms  of  the  output  of  the  loop  for 
the  input  X  (i.e.  IP](X))  and  the  output  of  the  loop  for  the 
input  c  $  [P](X)  (i.e.  IP]  (c  $  [P]  (X) ) ) . 

Lemma  5.1  -  Let  P  be  uniformly  implemented  wrt  Then 

(5.10)  x"e~D([Pl)  ->  [P]  (c  $  X)-[P]  (c  $  [P]  (X) ) . 

Proof  -  We  use  induction  on  the  number  of  iterations  of  P  on 
X.  For  the  base  case  of  0  iterations/  [P](X)»X,  and  the  lemma 

holds.  Suppose  it  holds  for  all  input  data  states  X  requiring 

n-1  iterations  where  n  >  0.  Let  XI  require  n  iterations.  Since 
n  >  0,  B(X1)  holds.  By  (5.9),  B  (c  $  XI).  Note  that  H(Xl) 
requires  n-1  iterations  on  P;  thus  by  the  inductive  hypothesis 
[P]  (c  $  H  (Xl) )  -  (P](c$  [P ]  (H  ( XI ) ) ) . 

Due  to  the  uniform  implementation  this  is 

[P ]  (H  (c  $  XI))  -  (P](C  $  [P  ]  (H  ( Xl )  )  )  . 

Using  the  loop  property  B(X)  ->  [P]  (X)*  [P ]  (H  (X) )  on  both  sides  we 

get 

[P](c$x  1)  -  !P](c$  [P]  (Xl) ) . 

Thus  the  inductive  step  holds  and  the  lemma  is  proved. 

The  general  idea  behind  our  use  of  the  lemma  is  as  follows. 
Suppose  the  value  IP]  (X)  is  known  for  some  particular  X.  I.e. 
suppose  we  know  what  the  loop  produces  for  the  input  X.  In  addi¬ 
tion,  suppose  that,  given  the  result  [P]  (X) ,  the  quantity  [P]  (c  $ 
IP]  (X) )  is  also  known.  With  this  information,  we  can  then  use 
Lemma  5.1  to  "solve"  for  the  (possibly  unknown)  value  IP]  (c  $  X)  . 
This  additional  information  concerning  the  input/output  behavior 
of  the  loop  can  be  used  as  an  aid  in  constructing  a  valid  gen¬ 
eralization  of  the  specification  f. 

How  can  we  find  the  value  {P](X)  and  then  the  value  IP]  (c  $ 
[P]  (X) )  for  some  X?  The  key  lies  in  assuming  the  loop  P  is 
correct  wrt  f.  If  P  is  not  correct  wrt  f,  any  generalization  of 
f  obtained  by  the  technique  will  be  a  valid  generalization  by 
definition,  under  this  assumption,  IP]  (X)  is  known  for  X  e  D(f), 
i.e.  X  6  D(f)  ->  [P ]  (X) “f  (X)  ,  and  hence  Lemma  5.1  implies 

(5.11)  X  6  D(f)  ->  lP](c  $  X)-(P](c  $  f  ( X) ) . 

Consider  now  the  base  generalization  g  of  f  defined  in  the 
previous  section.  Recall  that  g  is  simply  f  augmented  with  the 
identity  function  ever  the  domain  where  the  loop  predicate  B  is 
false.  Assuming  as  before  that  P  is  correct  wrt  f,  P  is  then 
correct  wrt  g  by  Theorem  5.1;  hence  X  €  D(g)  ->  IP]  (X)«g(X) . 
Thus  (5.11)  implies 

(5.12)  X  6  D(f)  &  c  $  f(X)  e  Dig)  ->  IP]  (c  $  X)-g(c  $  f(X)). 

Thus  we  can  "solve"  for  the  behavior  of  the  loop  on  the  input  c  $ 
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X,  assuming  X  €  D( f) ,  c  $  f(X)  6  0(g)  and  P  is  correct  wrt  f. 
This  suggests  that  i£  £'  is  the  extension  of  f  defined  by 

(5.13)  {(c  $  X,g(c  $  f(X)))  1  X  6  D(f)  6  c  $  f  (X)  6  D(g) } , 
then  £'  is  a  valid  generalization  of  f.  Before  giving  a  formal 
proof  of  this  result,  however,  we  first  consider  the  question  of 
the  existence  of  such  an  extension  of  f.  Specifically,  it  could 
be  that  for  some  c  and  X  satisfying  X  6  D(f)  and  c  $  f(X)  e  0(g), 
that  c  $  X  6  D(f)  and  f(c  $  X)  ?  g(c  $  f(X)).  This  would  imply 
that  the  extension  of  f  defined  by  (5.13)  does  not  exist.  The 
following  theorem  states  that  this  implies  P  is  not  correct  wrt 
f. 

Theorem  5.2  -  Let  P  be  uniformly  implemented  wrt  $".  Let  g 
be  tKe  base-  generalization  of  f.  If  P  is  correct  wrt  f,  then 
there  exists  a  function  f“  which  is  the  extension  of  f  defined  by 

(5.13) ,  i.e. 

{  (c  $  X,g(c  $  f(X)))  |  X  6  D(f)  &  c  $  f  ( X)  6  D(g)  }  . 

Proof  -  Let  f "  be  the  function  computed  by  the  loop,  i.e. 
CPI-  Since  P  is  correct  wrt  f,  P  is  correct  wrt  g,  and  f“  is  a 
superset  of  both  f  and  g.  By  the  lemma 
f'(c  $  X)  »  f'(c  $  f'(X)) 

for  all  X  6  D(f).  Since  f'(X)-f(X)  for  X  e  D(f)  and  f'(X)»g(X) 
for  X  e  D(g) , 

X  e  D(f)  &  c  $  f(X)  e  D(g)  ->  f > ( c  $  X)«g(c  $  f(X)) 
holds.  Thus  (5.13)  is  a  subset  of  f'.  Since  f  is  a  subset  of 
£',  the  union  of  f  and  (5.13)  is  a  subset  of  £'.  Hence  this 
union  is  a  function  and  is  thus  the  extension  of  f  defined  by 

(5.13) . 

The  following  theorem  is  the  central  result  presented  here. 
The  theorem  formalizes  the  use  of  Lemma  5.1  in  the  manner  sug¬ 
gested  above,  i.e.  that  the  extension  of  f  described  in  the  pre¬ 
vious  theorem  is  a  valid  generalization  of  the  original  specifi¬ 
cation. 

Theorem  5.3  -  Let  P  be  uniformly  implemented  wrt  $“*.  Let  g 
be  the  base  generalization  of  f.  If  £'  is  the  extension  of  f 
defined  by  (5.13),  i.e. 

{  (c  $  X,g(c  $  f  ( X) ) )  |  X  6  D(f)  &  c  $  f  ( X)  6  D(g) }  , 

then  £'  is  a  valid  generalization  of  f. 

Proof  -  Suppose  P  is  correct  wrt  f.  We  must  show  P  is 
correct  wrt  £'.  Let  X  6  D(f').  If  X  6  D(f),  the  loop  handles 
the  input  correctly  by  hypothesis.  If  X  is  not  in  D(f),  we  must 
have  X  ■  c  $  XI  where  XI  6  D(f)  and  c  $  f(Xl)  6  0(g).  By  Lemma 
5.1,  [P]  (X)  ■  IP]  (c  $  [P]  (XI)}.  Since  P  is  correct  wrt  f  this  is 
[P]  (X)  ■  [P]  (c  $  f  ( xl) ) .  By  Theorem  5.1,  P  is  correct  wrt  g. 
Using  this,  the  equality  can  be  written  as  [P]  (X)  ■  g(c  $  f(Xl)). 
By  the  definition  of  £*  this  implies  [P]  (X)  -  f“(X).  Thus  P  and 
£'  are  in  agreement  on  the  input  X  and  consequently  are  in  agree¬ 
ment  on  any  input  in  Off').  Hence  P  is  correct  wrt  £'  and  thus 
£'  is  a  valid  generalization  of  f. 


The  significance  of  Theorem  5.3  is  that  it  provides  a  guide¬ 
line  for  generalizing  the  specification  of  a  uniformly  imple¬ 
mented  loop.  If  the  loop  is  closed  for  the  domain  of  the  result¬ 
ing  specification,  the  generalization  can  then  be  used  to 
prove/disprove  the  program  correct  wrt  the  original  specifica¬ 
tion. 


1*1*  Applications 

In  this  section  we  illustrate  the  use  of  Theorem  5.3  with  a 
number  of  example  programs  which  fall  into  either  of  two 
subclasses  of  uniformly  implemented  loops.  The  subclasses 
correspond  to  the  two  possible  circumstances  which  can  occur  when 
c  $  f  ( X)  of  set  definition  (5.13)  belongs  to  the  set  D(g)  :  the 

first,  because  ~B(c  $  f(X)),  and,  the  second,  because  c  $  f(X)  8 
D(f).  In  each  of  these  situations,  the  set  definition  (5.13) 
takes  on  a  particularly  simple  form. 

Definition  5.7  -  A  uniformly  implemented  loop  satisfying 
~B  (X)  ->~“B  (c  $  X) 
is  a  Type  A  loop. 

Observe  that  this  condition  along  with  (5.9)  indicates  that 
a  Type  A  uniformly  implemented  loop  satisfies 
B(X)  <->  B  (c  $  X)  , 

i.e.  the  value  of  the  loop  predicate  B  is  independent  of  a  change 
to  the  data  state  by  the  operator  $. 

The  intuition  behind  a  Type  A  uniformly  implemented  loop  is 
as  follows.  Whenever  an  execution  of  a  Type  A  loop  terminates 
(i.e.  “B (X)  holds)  and  the  resulting  data  state  is  modified  by 
the  operator  $,  the  result  is  a  new  data  state  which,  when  viewed 
as  a  loop  input,  corresponds  to  zero  iterations  of  the  loop  (i.e. 
the  predicate  B  is  still  false  despite  the  modification).  This 
property  is  reflected  in  the  following  corollary. 

Corollary  5.3-1  -  Let  P  be  a  Type  A  loop.  If  f  '  is  the  ex- 
tension  of  f  ae"5ine3f  by 

(5.14)  {  (c  $  X,c  $  f(X))  1  X  e  D(f)}, 

then  £*  is  a  valid  generalization  of  f. 

Proof  -  The  proof  consists  of  showing  that  (5.13)  and  (5.14) 
are  equivalent  for  a  Type  A  loop  which  is  correct  wrt  f.  By 
Theorem  5.3,  the  corollary  then  holds.  Let  P  be  a  Type  A  loop 
which  is  correct  wrt  f.  A  consequence  of  the  correctness  proper¬ 
ty  is  that  ~B  ( f  ( X) )  for  all  X  6  D(f).  Since  P  is  a  Type  A  loop, 
this  implies  ~B(c  $  f(X)).  Thus  c  $  f(X)  6  D(g)  and  g(c  $ 
f  (X) )  ac  $  f(X).  Consequently  (5.13)  and  (5.14)  are  equivalent. 

Of  course,  once  a  generalization  f'  has  been  obtained  via 
Corollary  5.3-1,  there  is  no  reason  why  that  result  cannot  be  fed 
back  into  the  corollary  to  obtain  a  (possibly)  further  generali¬ 
zation  £"  (using  £'  for  f,  £"  for  £ ') .  This  notion  suggests 
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the  following  general  case  of  Corollary  5.3-1. 

Corollary  5.3-2  -  Let  P  be  a  Type  A  loop.  If  £'  is  the  ex- 
tens  ionorfa  el in  e2T  by 

{  (cl$  (c2$  ...  (cn$X)  ...  ),cl$(c2$  ...  (cn$f(X))  ...  ))  I 

X  €  D(  f)  &  n>«0}  , 

then  f"  is  a  valid  generalization  of  f. 

Example  5.4  -  Consider  the  following  program  to  conpute  ex¬ 
ponentiation.- 


{ w*l,e>0,d>«0} 
while  d  >  0  do 

If  odd(d)  then  w  :»  w  *  e  fi  ? 
e  :»  e*e ;  d  s»d/2 
od 

{ w^eff- A  dO} 

The  infix  operator  A  appearing  in  the  postcondition  represents 
integer  exponentiation.  In  this  exanple,  w  plays  the  role  of  the 
key  variable  z,  and  the  pair  <e,d>  corresponds  to  the  variable  y. 
We  now  consider  wrt  what  operation  the  loop  might  be  uniformly 
implemented.  For  any  operation  $",  (5.7)  holds  (because  w  does 
not  appear  in  the  loop  predicate)  as  does  (5.6)  (because  the 
values  produced  in  e  and  d  are  independent  of  w) .  Furthermore, 
(5.5)  must  hold  for  inputs  which  bypass  the  updating  of  w.  Thus 
the  uniformity  conditions  reduce  to 

d  >  0  &  odd(d)  ->  (c  $"  w)  *  e  »  e  $'  (w  *  e) 

Due  to  its  associativity,  it  is  clear  the  loop  is  uniformly  im¬ 
plemented  wrt  *,  where  the  sets  C  and  Z  are  the  set  of  integers. 
Since  the  key  variable  does  not  appear  in  the  loop  predicate,  it 
is  necessarily  a  Type  A  loop.  The  specification  function  here  is 
f(<l,e,d>)  »  <e  *  d,e',d‘’> 

where  e>0,  d>* 0  and  e'  and  d“  are  the  final  values  computed  by 
the  loop  for  the  variables  e  and  d.  Applying  Corollary  5.3-1,  the 
function  £'  defined  by 

f'(<e*l,e,d>)  -  <c*(e  *  d),e',d'> 
where  e>0  and  d»0  is  a  valid  generalization  of  f.  Since  this 
holds  for  all  c,  the  definition  of  £'  can  be  rewritten  as 
£'(<w,  e,d>)  «  <w*(e  *  d),e',d'>, 

where  w  is  an  arbitrary  integer,  e>0  and  d>«0.  The  generaliza¬ 
tion  £'  is  adequate  and  can  thus  be  used  to  test  the  correctness 
of  the  program  wrt  the  original  specification.  Applying  (5.2), 
(5.3)  and  (5.4)  from  above,  these  necessary  and  sufficient  verif¬ 
ication  conditions  are 

-  the  loop  terminates  for  all  e>0,  d>-0, 

-  d«0  ->  w»w*{e  A  d) ,  and 

-  w*(e  A  d)  is  a  loop  constant  (i.e.  eO  A  dO  ■  w*(e  A  d) 

is  a  loop  invariant)  , 

respectively.  In  Section  5.4,  we  will  discuss  a  simplification 
of  the  last  of  these  verification  conditions  which  applies  for 
uniformly  implemented  loops. 
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Example  5.5  [Misra  79]  -  The  following  program  constructs 
the  p reorder”  traversal  of  a  binary  tree  with  root  node  r.  The 
program  uses  a  stack  variable  st  and  records  the  traversal  in  a 
sequence  variable  seq. 

{seq-NULL,  st* (r)  /*stack  st  contains  only  the  root  node  r*/} 
while  st  ji  EMPTY  do 

p  <■  st;  /*pop“the  top  off  the  stack*/ 

seq  :•  seq  |  |  name(p)  ;  /Concatenate  name  of  p  onto  seq*/ 
if  right(p)  j  NIL  then  st  <»  right(p)  fi.?  /*push  onto  st*/ 

IT  left(p)  /  NIL  then  st  <■  left(p)  fi 
od 

{seq-PREORDER  (r)  } 

The  function  PREORDER (r)  appearing  in  the  postcondit ion  is  the 
sequence  consisting  of  the  preorder  traversal  of  the  binary  tree 
with  root  node  r.  Let  seq  be  the  key  variable.  Similar  reason¬ 
ing  to  that  employed  in  the  previous  example  indicates  here  that 
the  loop  is  uniformly  implemented  wrt  |  | ,  where  the  sets  C  and  Z 
are  the  set  of  all  sequences.  It  is  a  Type  A  loop.  The  specifi¬ 
cation  function  is 

f(<NOLL,  (r)>)  *  <  PREORDER  (r)  ,  st"> . 

Again#  the  "  notation  is  used  to  represent  the  final  values  of 
variables  that  are  of  no  interest.  Applying  Corollary  5.3-1  we 
obtain 

f'(<seq#  (r)>)  -  <seq|  jPRBDRDER  (r)  #  st '> 
as  a  valid  generalization  of  f.  In  this  case#  £'  is  not  adequate 
since  it  does  not  specify  a  behavior  of  the  loop  for  arbitrary 
values  of  the  stack  st.  We  will  return  to  this  example  after 
considering  another  subclass  of  uniformly  implemented  loops. 

Definition  5.8  -  A  uniformly  implemented  loop  satisfying 
“B  (X)  ->  c  f  X  6  D(  f) 
is  a  Type  B  loop. 

The  intuition  behind  a  Type  B  uniformly  implemented  loop  is 
as  follows.  Whenever  an  execution  of  a  Type  B  loop  terminates 
(i.e.  “B (X)  holds)  and  the  resulting  data  state  is  modified  by 
the  operator  $#  the  result  is  a  new  data  state  which  is  a  "valid" 
starting  point  fbr  a  new  execution  of  the  loop  (i.e.  this  new 
state  is  in  D(f)).  This  property  is  reflected  in  the  following 
corollary. 

Corollary  5.3-3  -  Let  P  be  a  Type  B  loop.  If  £'  is  the  ex- 
tension  of  t  de"?ine<I  by 

(5.15)  {  (c  $  X#f(c  $  f(X)))  |  X  8  D(f)|, 

then  £'  is  a  valid  generalization  of  f. 

Proof  -  The  proof  consists  of  showing  that  (5.13)  and  (5.15) 
are  equivalent  for  a  Type  B  loop  which  is  correct  wrt  f.  By 
Theorem  5.3#  the  corollary  then  holds.  Let  P  be  a  Type  B  loop 
which  is  correct  wrt  f.  A  consequence  of  the  correctness  proper¬ 
ty  is  that  “B  ( f  ( X) )  for  all  X  6  D(f).  Since  P  is  a  Type  B  loop# 
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this  implies  c  $  f(X)  6  D(f).  Thus  c  $  f(X)  6  D(g)  and  g(c  $ 
f  (X)  )-f  (c  $  £ ( X) ) .  Consequently  (5.13)  and  (5.15)  are 

equivalent. 

As  before,  a  general  case  of  this  corollary  can  be  stated 
which  corresponds  to  an  arbitrary  number  o£  its  applications. 

Oorollary  5.2~4  -  Let  P  be  a  Type  B  loop.  If  f"  is  the  ex¬ 
tension  of  f  define?  by 

{  (cl$ (c2$ (. . .$ (cn$X) ...)), f (cl$f (c2$f ( . . .$f (cn$f (X) )...)))) | 

X  6  0(f)  &  n>-0} , 

then  £'  is  a  valid  generalization  of  f. 

Example  5.5  (continued)  -  We  now  consider  the  problem  of 
further  generalizing  the  derived  specification  in  the  previous 
example.  The  variable  for  which  the  loop  is  not  closed,  st,  will 
now  be  the  key  variable.  Consider  an  operation  c  $'  st  that  has 
the  effect  of  adding  an  element  c  to  the  stack  st.  Before  being 
more  precise  about  this  operation,  we  consider  how  the  loop  body 
works,  and  how  its  output  depends  on  the  value  of  the  key  vari¬ 
able  st. 

We  observe  that  the  loop-body  behavior  relies  heavily  on  the 
characteristics  of  the  node  on  the  top  of  the  stack.  Consequent¬ 
ly,  a  modification  c  $"  st  to  st  which  pushed  a  new  node  c  onto 
the  top  of  st  would  not  cause  a  slight  and  orderly  change  in  the 
result  produced  by  the  loop  body  and  the  uniformity  conditions 
(5.5) -(5.7)  would  not  hold.  However,  because  the  loop -body 
behavior  seems  to  be  independent  of  what  lies  underneath  the  top 
of  the  stack,  we  suspect  the  loop  is  uniformly  implemented  wrt 
ADDUNDER,  where  C  is  the  set  of  binary  tree  nodes,  Z  is  the  set 
of  stacks  of  binary  tree  nodes,  and  c  ADDUNDER  st  is  the  stack 
that  results  from  adding  c  to  the  bottom  of  st.  Conditions 
(5.5)-(5.7)  for  this  operation  indicate  that,  indeed,  this  is  the 
case. 

Let  f  be  the  generalization  £'  from  the  previous  example. 
In  keeping  with  the  convention  described  above,  since  st  is  now 
the  key  variable,  we  will  reverse  the  order  in  which  the  two 
variables  appear  in  the  data  state,  i.e.  we  will  write  <st,seq> 
instead  of  <seq,st>. 

The  program  is  a  Type  B  uniformly  implemented  loop  since 
st -EMPTY  ->  <c  ADDUNDER  8t,seq>  9  D(  f) 
where  c  is  a  node  of  a  binary  tree,  and  specifically 
(5.16)  st -EMPTY  ->  f ( <c  ADDUNDER  st,seq>) 

-<st',  seq|  | PREORDER  (c)  >  . 

Applying  Corollary  5.3-4,  if  (r,cn,  ...  ,cl)  is  an  arbitrary 
stack  (with  r  on  top,  cl  on  the  bottom,  n>-0) 
f'(<(r,cn,  ...  ,cl),seq>)  - 
f "(cl$ (c2$ (  ...  $ (cn$<(r) ,seq>)  ...  )))  - 
f (cl$f (c2$f (  ...  $f (cn$f (<(r) ,seq>) )  ...  )))  - 
f(cl$f(c2$f(  ...  $f (cn$<st',seq |  f PREORDER (r)>)  ...  ))). 
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Recall  that  at'  refers  to  the  final  value  of  st  coig>uted  by  the 
loop.  The  loop  predicate  indicates  this  will  always  be  the  value 
EMPTY.  Hence  (5.16)  can  be  applied  to  this  expression  from  in¬ 
side  out  giving 

f(cl$f  (c2$f  (...$<st\  seq|  | PRBORDER  (r)  |  | PRBORDER  (cn)  >...))) 

*  e  e  e  * 

<st",seq|  I  PRBORDER  (r)  |  |PRBORDER  (cn)  |  |  ...  |  |PRBORDBR  (cl)  >  . 

Note  that  f“  now  defines  a  loop  behavior  for  all  sequences  seq 
and  nonempty  stacks  st.  The  base  generalization  of  f  "  supple¬ 
ments  f"  with  a  behavior  for  the  empty  stack  st  and  is  thus  an 
adequate  generalization. 

Example  5.6  [Gries  79]  -  The  following  program  confutes 

Ackermanns  “function  using  a  sequence  variable  s  of  natural 
numbers.  The  notation  s(l)  is  the  rightmost  element  of  s  and 
s(2)  is  the  second  rightmost,  etc.  The  sequence  s(..3)  is  s  with 
s(2)  and  s(l)  removed. 


<s(l)  +  l> 

<s  (2)  -1  ,1> 

<s  (2)  -1  ,s(2) ,  s(l)  -1>  fi 


The  function  A(m, n)  appearing  in  the  postcondition  in  Ackermann's 
function.  The  specification  function  is 

f(<s(2) , s(l)>)*<A(s(2) rs(l) )> . 

Let  s  be  die  key  variable.  As  the  loop-body  behavior  is  indepen¬ 
dent  of  the  leftmost  portion  of  s,  the  loop  is  uniformly  imple¬ 
mented  wrt  | ,  where  C  is  the  set  of  natural  numbers,  Z  is  the  set 
of  nonempty  sequences  of  natural  numbers,  and  c|s  *  <c>||s.  The 
program  is  also  a  Type  B  loop.  By  Corollary  5.3-4  (where  n>l), 
f"(<s(n)  ,s(n-l) ,  ...  ,s(l)>)  - 

f"(s(n)$(s(n-l)$(  ...  $  (s(3)$<s(2),s(l)>)  ...  )))  - 
f (s(n) $f (s(n-l)$f (  ...  $f(s(3)$f(<s(2),s(l)>))  ...  )))  - 
f (s(n)$f (s(n-l)$f (  ...  $f(s(3)$<A(s(2),s(l)>)  ...  )))  - 
f (s(n) $f (s(n-l)$f (  ...  $f(<s(3),A(S(2),s(l))>)  ...  )))  - 
f(s(n)$f(s(n-l)$f(  ...  $<A(s(3),A(s(2),s(l)))>  ...  ))) 

®  e  e  e  ® 

<A(s(n) ,A(s(n-l) ,  ...  ,A (s(3) ,A (s(2) , s(l) ) )  ...  ))> 
is  a  valid  generalization  of  f.  As  in  the  previous  example,  the 
base  generalization  of  this  function  is  adequate. 

5.4.  Simplifying  the  *  Iteration  Condition" 

The  view  of  WHILE  loop  verification  presented  here  is  one  of 
a  two  step  process,  the  first  step  being  the  discovery  of  an  ade¬ 
quate  valid  generalization  £'  of  the  loop,  specification  f,  die 
second  being  the  proof  of  3  basic  conditions  (i.e.  (5. 2) -(5. 4)) 
based  cn  tiiis  generalization,  we  have  seen  that  the  uniform  na¬ 
ture  of  a  loop  implementation  may  be  used  in  the  first  step  as  an 


{s-<m,n>  ,m>«0,n>«0} 
while  size(s)  ^  1  do 
IF  s(2)«0  tEen  s 
eTseif  s(l)*0  then  s 
else  s 

Od“ 

{s*<A(m,  n)  > } 


■s  ( . .  3) 
■s  ( . .  3 ) 
■s(.  .3) 
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aid  in  discovering  an  appropriate  generalization.  In  this  sec¬ 
tion,  vie  will  exploit  the  same  loop  characteristic  to  substan¬ 
tially  simplify  one  of  the  conditions  which  must  be  proven  in  the 
second  step  of  this  process. 

The  verification  condition  of  interest  is  (5.4)  above,  i.e. 

X  e  D(  f  ■*)  &  B  (X)  ->  f  '(X)«f  '(H  (X)  )  , 
and  is  labeled  the  iteration  condition  in  [Misra  78]  .  This  con¬ 
dition  assures  that  as  the  loop  executes,  the  intermediate  values 
of  X  remain  in  the  same  level  set  of  £',  i.e.  the  value  of  £'  is 
constant  across  the  loop  iterations.  Previously  we  argued  that 
if  P  is  uniformly  implemented  wrt  $",  a  change  in  the  key  vari¬ 
able  by  $'  causes  a  slight  but  orderly  change  in  the  result  pro¬ 
duced  by  H.  Roughly  speaking  then,  the  behavior  of  H  is  largely 
independent  of  the  key  variable.  If  f'  is  chosen  so  as  to  be 
equally  independent  of  the  key  variable,  and  the  above  condition 
holds  fOr  X=<z,y>  where  y  is  arbitrary  but  the  key  variable  z  has 
a  specific  simple  value,  we  might  expect  the  condition  to  hold 
for  all  X.  Such  an  expectation  would  be  based  on  the  belief  that 
the  truth  or  falsity  of  this  condition  would  also  be  largely  in¬ 
dependent  of  the  key  variable. 

we  formally  characterize  this  circumstance  in  the  following 
definition. 

Definition  5.9  -  Let  P  be  a  loop  of  the  form  described 
above.  A  generalization  f"  of  f  is  represented  by  f  iff 

(5.17)  X  e  D(f  )  &  B(X)  ->  f  (X)-f'(H(X)) 

-> 

(5.18)  X  6  D(  f ")  &  B(X)  ->  f  '(X)»f  '(H(X)  )  . 

Thus  if  £'  is  represented  by  f,  condition  (5.17)  can  be  used 
in  place  of  the  iteration  condition  (5.18)  in  proving  the  loop  is 
correct  wrt  £'  (and  hence  wrt  f )  .  The  significance  of  this  si¬ 
tuation  is  that  the  iteration  condition  can  be  tested  with  the 
key  variable  constrained  by  initialization  (as  prescribed  in 
D(  f) ) .  In  practice,  the  result  is  one  of  having  to  prove  a  sub¬ 
stantially  simpler  verification  condition. 

The  following  theorems  state  that  the  use  of  Corollaries 
5.3-2  and  5.3-4  lead  to  generalizations  which  are  represented  by 
the  original  specification. 

Theorem  J.4  -  Let  P  be  a  Type  A  loop.  Suppose  £'  is  the 
valid  generalization  of  f  defined  in  Corollary  5.3-2  and  suppose 
£'  is  adequate.  Then  f'  is  represented  by  f. 

Proof  -  Suppose  (5.17)  holds  and  select  some  arbitrary  X' 
from  “dTP)  satisfying  B(X'>).  Thus  there  exists  cl,  ...»  cn  e  C, 
n>«0  and  X  e  D(f)  such  that 

X"*  cl$ (c2$ (. . .$ (cn  $  X  )...)*. 

By  the  definition  of  a  Type  A  loop,  we  must  have  B  (X) .  Applying 
the  definition  of  £'  yields 
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f'(X>)»cl$  (c2$  (...$  (cn$f  (X  )...)) 
which  ia 

«cl$  (c2$  (. .  .$  (cn$f  '(E  (X) ) . . . ) ) 

by  (5.17)  since  B  (X)  holds.  Since  H(X)  e  D(f')  (since  £'  is  ade¬ 
quate),  there  exists  dl,  . ..,  dm  6  C,  m>»0,  and  XI  e  D(f)  such 
that 

H (X)  *dl$ (d2$ (. . .$ (dm$  XI  )...)). 

Furthermore, 

£' (H  (X) )  »dl$ (d2$ ( . . .$ (dm$f (XI) )...)). 

Hence,  continuing  from  above 

f '(X')»  cl$  (. .  .$  (cn$  (dl$  (. .  .$  (dm$f  (XI) )...)))...) 
which  is 

-f'(cl$(...$(cn$(dl$(...$(dm$  Xl  )...)))...)) 
from  the  definition  of  f".  Thus 
f'(X')  •£'  (cl$  (. .  .$  (cn$H  (X) ) . . .  > ) 
which  is 

»f>(H(cl$(...$(cn$  X  )...))) 
from  the  uniformity  condition  (5.8).  Hence 
£'(X')~f'(H  (X') ) 
and  the  theorem  is  proved. 

Theorem  J.5  -  Let  P  be  a  Type  B  loop.  Suppose  f“  is  the 
valid  generalization  of  f  defined  in  Corollary  5.3-4  and  suppose 
f'  is  adequate.  Then  £'  is  represented  by  f. 

Proof  -  Suppose  (5.17)  holds  and  select  some  arbitrary  X' 
from  D(’f "*)  satisfying  BfX").  Thus  there  exists  cl,  ...,  cn  e  C, 
n>*0  and  X  6  D(f)  such  that 

X'«  cl$  (c2$  (. .  .$  (cn  $  X  )...)). 

We  make  the  assunption  that  B(X).  Otherwise,  by  the  definition 
of  a  Type  B  loop,  the  term  cn  $  X  can  be  replaced  by  another  X  6 
D(f).  Since  B  (X“) ,  this,  process  can  be  continued  until  X"  is 
written  in  the  form  above,  with  X  6  D(f)  and  B  (X) .  Applying  the 
definition  of  £'  yields 

f'(X')»f (cl$f (c2$f(...$f (cn$f  (  X  ))...))) 
which  is 

-f  (clSf  (c2$f  (. .  .$f  (cn$f  ■'(H  (X) ))...)) ) 
by  (5.17)  since  B(X)  holds.  Since  H(X)  e  D(f')  (since  £'  is  ade¬ 
quate),  there  exists  dl,  ...,  dm  6  C,  m>*0,  and  Xl  6  D(f)  such 
that 

H (X)  »dl$ (d2$ (. . .$ (dm$Xl) ...)). 

Furthermore, 

V (H (X) ) -f ( dl$f ( d2$f ( . . . $f ( dm$f (Xl) )...))) . 

Hence,  continuing  from  above 
f'(X')-f(cl$f(...$f(cn$f (dl$f <...$f (dm$f (Xl) )...)) )...)) 
which  is 

•£'  (cl$ (. . .$ (cn$ (dl$ (. . .$ (dra$Xl) ...)))...)) 
from  the  definition  of  £'.  Thus 
f '(X')«f  '  (cl$  (. .  .$  (cn$H  (X) ) . . . ) ) 
which  is 

-f'(H(cl$(...$(cn$  X  )...))) 
from  the  uniformity  condition  (5.8).  Hence 
f'(X')»f'(H(X')) 


and  the  theorem  is  proved 


Example  J.7  -  Consider  the  exponentiation  program  of  Example 

5.4.  The  generalization  obtained  from  Corollary  5.3-2  is 

f'(<w,e,d>)-<w*(e*d)  ,e',d'>, 

where  e>0,  d>«0.  Since  £'  is  represented  by  f,  the  iteration 
condition  corresponding  to  (5.17) 

e>0  &  d>0  &  odd(d)  ->  eAd  ■  e*  ( (e*e)  *  (d/2) )  & 

e>0  &  d> 0  &  “ odd ( d)  ->  eAd  =*  1*  ( (e*e)  A  (d/2) ) 

can  be  used  in  place  of  that  corresponding  to  (5.18) 

e>0  &  d>0  &  odd(d)  ->  w*(eAd)  -  (w*e)  *  ( (e*e)  A  (d/2) )  & 

e>0  &  d>0  &  ~ odd ( d)  ->  w*(e*d)  *  w*  ( (e*e)  *  (d/2) ) . 

The  benefits  of  this  simplification  are  more  striking  for  more 
complex  types  of  key  variables.  To  illustrate,  consider  the  pro¬ 
gram  to  compute  Ackermann's  function  in  Example  5.6.  Applying 
Corollary  5.3-4  to  the  base  generalization  g  of  f  yields  the  gen¬ 
eralization  defined  by 

n>l  ->  f  “( <s  (n) ,  s(n-l) , . . . ,  s(l)> )  * 

<A(s(n)  ,  A  ( s  ( n— 1 ) , .  •  •  »A  (s(3)  ,A(s(2) ,  s(l) ) ) ...  )  )> 

and 

f'(<s(l)>)»<s(l)>. 

Since  £'  is  represented  by  g,  the  iteration  condition 
s(2)*0  ->  <A(s(2),s(l))>*<s(l)+l>  & 

s(2)/0  &  s(l)*0  ->  <A(s(2),s(l) )>*<A(s(2) -1,1) >  & 

s(2)/0  &  s(l)/0  ->  <A(s(2),s(l))>*<A(s(2)-l,A(s(2),s(l)-l))> 
can  be  used  in  place  of 
n>l  &  s(2)*0  -> 

<A(s(n)  ,A (s( n-1) , ... ,A(s( 3) ,A(s(2) , s(l) ))...) )>» 

<A(s(n)  ,A  (s(n-l) , . . .  ,A(s(3)  ,s(l)+l)  ...))>  fi 

n>l  &  s(2)/0  &  s ( 1) ** 0  -> 

<A(s(n)  , A  ( s ( n— 1) ,  •  • .  ,A  ( s ( 3 )  , A  ( s ( 2 ) ,  s ( 1 ) ) )  •  • .)  )>* 

<A(s(n)  ,A  (s(n-l) , . . . ,A (s(3) ,A (s( 2) -1 ,1) )...)) >  & 

n> 1  &  s(2)/0  &  s(l)/0  -> 

<A(s(n)  ,A  (s(n-l) ,..., A  (s(3),A(s(2),s(l))). ..))>** 

<A(s(n)  ,A (s( n-1) ,..., A (s(3),A(s(2)-l, A (s(2),s(l)-l )))...))>. 

5.5.  Recognizing  Uniformly  Implemented  Loops 

Although  the  problem  of  recognizing  uniformly  implemented 
loops  is  in  general  an  unsolvable  problem,  the  following  guide¬ 
lines  seem  useful  in  a  large  number  of  situations. 

Recognizing  uniformly  implemented  loops  can  be  viewed  as  a 
search  for  an  operation  wrt  which  the  loop  is  uniformly  imple¬ 
mented.  In  practice,  condition  (5.5)  is  the  most  demanding  con¬ 
straint  on  this  operation.  An  effective  strategy,  therefore,  is 
to  use  (5.5)  as  a  guideline  to  suggest  candidate  operations. 
Conditions  (5.6)  and  (5.7)  must  be  proven  to  show  the  loop  is  un¬ 
iformly  implemented  wrt  some  particular  candidate. 

Often  the  modification  to  the  key  variable  z  in  the  loop 
body  is  performed  by  a  statement  of  the  form 
z  z  #  e(y) 


for  some  dyadic  operation  #  and  function  e.  In  this  case,  condi¬ 
tion  (5.5)  suggests  the  loop  may  be  uniformly  implemented  wrt  I 
or  some  directly  related  operation.  For  example,  if  #  is  associ¬ 
ative,  condition  (5.5)  holds  fOr  #.  If  t  satisfies 
(a  #  b)  #  c  *  (a  #  c)  #  b 

(e.g.  subtraction),  and  an  inverse  t'  of  f  exists  satisfying 
a  #  b  *  c  <->  b  #"  c  »  a 

(e.g.  addition  if  #  is  subtraction),  condition  (5.5)  holds  for 


Another  commonly  occurring  case  is  when  the  future  values  of 
the  key  variable  z  are  independent  of  y,  i.e. 
h'(z,yl)  *  h  "  ( z ,  y2 ) 

for  all  z,  yl  and  y2.  This  situation  arises  most  frequently  when 
z  is  some  data  structure  which  varies  dynamically  as  the  loop 
iterates.  Typically,  there  exists  some  particular  aspect  or  por¬ 
tion  of  the  data  structure  (e.g.  the  top  of  a  stack,  the  end  of  a 
sequence,  the  leaf  nodes  in  a  tree)  which  guides  its  modifica¬ 
tion.  A  useful  heuristic  which  can  be  employed  in  this  cir¬ 
cumstance  is  to  consider  only  operations  which  maintain  (i.e. 
keep  invariant)  this  particular  aspect  of  the  data  structure. 
Selecting  such  an  operation  $"  guarantees  that  the  "change"  ex¬ 
perienced  by  the  data  structure  in  the  loop  body  will  be  indepen¬ 
dent  of  any  modification  $'  and  thus  insures  condition  (5.5) 
holds. 


In  any  case,  recognizing  uniformly  implemented  loops  and 
determining  the  operation  wrt  which  they  are  uniformly  implement¬ 
ed  is  often  facilitated  if  the  intended  effect  of  the  loop  body 
(as  regards  the  key  variable)  is  documented  in  the  program  source 
text.  Such  documentation  abstracts  what  the  loop  body  does  from 
the  method  employed  to  achieve  this  result  and  thus  makes 
analysis  of  the  loop  as  a  whole  easier. 

To  illustrate,  consider  the  following  program  to  compute  the 
maximum  value  in  a  subarray  a[i..n]  of  natural  numbers: 

{m=0} 

while  i  <■  n  do 

IF  m  <  a[iT” then  m  :*  a[i]  fij 
i  +  1 
od 

{m*MAX IMUM ( a,  iO,n) } . 

If  the  effect  on  m  in  the  loop  body  were  documented  as 
m  :»  MAX (m, a[ i] ) , 

its  updating  would  be  of  the  form  m  :■  m  #  a[i]  and  the  heuristic 
discussed  above  could  be  employed  to  help  determine  that  the  loop 
is  uniformly  implemented  wrt  #  ■  MAX. 
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J.6.  Related  Work 

The  first  work  on  generalizing  functional  specifications  for 
loops  appears  in  (Basu  &  Misra  76]  .  These  results  are  refined  in 
[Misra  78]  and  are  studied  in  considerable  detail  in  [Misra  79]  . 
The  major  contribution  of  this  research  seems  to  be  the  identifi¬ 
cation  of  two  loop  classes  or  schemas  which  are  "naturally  prov¬ 
able.”  The  first  class  is  called  the  accumulating  loop  schema 
and  can  be  viewed  as  a  (commonly  occurring)  spec ial  case  of  the 
Type  A  loops  discussed  above.  Specifically,  a  program  in  the  ac¬ 
cumulating  loop  schema  with  associative  binary  operation  $"  in 
the  sense  of  [Basu  &  Misra  76]  is  necessarily  uniformly  imple¬ 
mented  wrt  $'  and  meets  the  criterion  for  a  Type  A  loop  presented 
above. 

The  second  of  these  classes  is  called  the  structured  data 
schema.  A  loop  in  this  class  is  uniformly  implemented  wrt  an 
operator  trfiich  adds  an  element  to  the  data  structure  being  pro¬ 
cessed  in  such  a  way  that  it  is  not  the  "next"  element  to  be  re¬ 
moved  from  the  structure  (e.g.  recall  the  use  of  ADDUNDER  in  the 
tree  traversal  example).  A  loop  in  this  class  necessarily  meets 
the  criterion  for  a  Type  B  loop  presented  above.  The  program  to 
compute  Ackermann's  function  does  not  fit  in  the  structured  data 
schema.  We  remark  that  the  analysis  presented  in  this  chapter 
relies  on  the  loop  body  conputing  a  function,  i.e.  it  relies  on 
the  loop  body  being  deterministic.  Consequently,  the  above  com¬ 
ments  do  not  apply  to  title  non-determinist ic  structured  data  loops 
analyzed  in  [Misra  79]  . 

< 

In  [Misra  79]  the  author  states  that  the  important  common 
feature  between  these  program  classes  is  that  "  ...  they  act  upon 
data  in  a  "uniform"  manner;  changes  in  the  input  data  lead  to 
certain  predictable  changes  in  the  result  obtained."  The  work 
described  in  this  chapter  can  be  viewed  as  an  attenpt  to  charac¬ 
terize  this  commonality  and  to  generalize  the  work  in  [Misra  79] 
baaed  on  this  characterization. 

More  recently,  [Basu  80]  considers  the  problem  of  generaliz¬ 
ing  loop  specifications  and  uses  the  idea  of  a  loop  being  "uni¬ 
form  over  a  linear  data  domain.”  One  difference  between  this 
work  and  our  notion  of  a  uniformly  implemented  loop  is  that  Basu 
considers  only  programs  in  the  accumulating  loop  schema  (in  the 
sense  of  [Basu  &  Misra  76]  without  the  closure  requirement) . 
More  importantly,  Basu's  idea  of  uniform  behavior  is  based  an  the 
behavior  of  the  loop  as  a  whole  and  seems  to  be  largely  indepen¬ 
dent  of  the  loop  body.  Our  approach  relies  solely  on  the  charac¬ 
teristics  of  the  loop  body. 

Misra  points  out  in  [Misra  78,  Misra  79]  that  the  iteration 
condition  for  his  structured  data  schema  can  be  simplified  in  a 
manner  similar  to  that  presented  here;  our  results  show  that  the 
same  simplification  can  be  applied  to  his  accumulating  loop  sche¬ 
ma.  Again,  an  appropriate  view  of  the  research  described  in  this 
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chapter  is  one  of  generalizing  this  earlier  work  by  investigating 
the  theory  which  underlies  these  phenomenon. 

5.7.  Discussion 


It  is  felt  that  a  critical  aspect  of  reading,  understanding 
and  verifying  program  loops  is  generalizing  the  behavior  of  a 
loop,  over  a  restricted  set  of  inputs  to  that  over  a  more  general 
set  of  inputs.  The  view  of  this  generalization  process  presented 
in  this  chapter  is  one  of  ascertaining  how  changes  in  values  of 
particular  input  variables  affect  the  subsequent  confutation  of 
the  loop.  This  process  is  facilitated  if  these  changes 
correspond  to  particularly  simple  modifications  in  the  result 
produced  by  the  loop  body. 

Of  course,  the  simplest  possible  modifications  in  the  result 
produced  by  the  loop  body  would  be  no  modifications  at  all,  i.e. 
the  output  of  the  loop  body  (and  hence  the  loop)  is  conpletely 
independent  of  changes  in  these  input  variables.  This  situation, 
however,  occurs  rarely  in  practice  since  it  implies  that  the  in¬ 
put  values  of  these  variables  serve  no  purpose  in  view  of  the  in¬ 
tended  effect  of  the  loop.  It  is  felt  that  the  definition  of  a 
uniformly  implemented  loop  presented  above  is  the  "next  best"  al¬ 
ternative,  and  yet  a  large  number  of  commonly  occurring  loops 
seem  to  possess  this  property.  The  definition  states  that  in 
terms  of  the  execution  of  the  loop  body,  prescribed  changes  in 
the  input  value  of  the  key  variable  affect  only  the  final  value 
of  the  key  variable;  all  other  final  values  are  independent  of 
the  change.  Just  as  importantly,  the  modification  caused  in  the 
final  value  of  the  key  variable  is  necessarily  the  same  as  the 
change  in  its  corresponding  input  value.  This  property  is  analo¬ 
gous  to  that  possessed  by  a  function  of  1  variable  with  unit 
slope  in  analytic  geometry;  increasing  the  input  argument  by 
some  constant  causes  the  function  value  to  be  increased  by  exact¬ 
ly  the  same  quantity.  TaI.en  together,  these  factors  account  for 
the  pleasing  symmetry  between  $  and  H  in  condition  (5.8). 

Viewed  as  a  verification  technique  for  uniformly  implemented 
loops,  the  procedure  described  above  can  be  thought  of  as 
transforming  the  problem  of  discovering  the  general  loop  specifi¬ 
cation  into  the  problem  of  discovering  the  operation  with  respect 
to  which  the  loop  is  uniformly  implemented.  Clearly,  this  is  of 
no  benefit  if  the  latter  is  no  easier  to  solve  than  the  former. 
In  many  cases,  however,  it  seems  that  simple  syntactic  checks  are 
sufficient  for  identifying  this  operation.  For  example,  in  the 
tree  traversal  program,  the  fact  that  the  loop  body  does  not  test 
the  stack  for  emptiness  (Basu  a  Misra  76]  is  a  sufficient  condi¬ 
tion  for  the  loop  being  uniformly  implemented  with  respect  to 
ADD0NDER. 

It  is  felt  that  the  notion  of  uniformly  implemented  loops 
may  have  an  application  in  the  program  development  process. 
Specifically,  when  designing  an  initialized  loop  to  compute  some 
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function,  the  programmer  should  attempt  to  construct  the  loop  in 
such  a  way  that  it  is  uniformly  implemented  with  respect  to  some 
easily  stated  operation.  Our  work  indicates  that  these  loops  are 
susceptible  to  a  rather  routine  form  of  analysis.  Furthermore, 
implementing  a  loop  in  a  uniform  fashion  requires  maintaining  a 
certain  amount  of  independence  between  program  variables  (or 
perhaps  portions  of  program  variables  in  the  case  of  structures) 
and  a  simple  dependence  between  the  input/output  values  computed 
by  the  loop  body.  Such  programs  are  desirable  since  the  ease 
with  which  a  loop  can  be  understood  depends  largely  on  the  com- 
plexity  of  the  interactions  and  interconnections  among  program 
variables.  He  remark  that  the  question  of  whether  a  given  pro¬ 
gram  is  "well  structured"  has  been  viewed  largely  as  a  syntactic 
issue  (e.g.  use  of  a  restricted  set  of  control  structures}  j  we 
offer  the  definition  of  a  uniformly  implemented  loop  as  an  at¬ 
tempt  at  a  characterization  of  a  semantically  well  structured 
program. 


The  purpose  of  this  research  has  been  an  investigation  of  a 
number  of  issues  involved  with  the  application  of  the  functional 
correctness  methodology.  In  Chapter  2 ,  we  defined  a  functional 
correctness  technique  and  addressed  the  issue  of  the  strategy  em¬ 
ployed  in  decooposing  cosposite  programs  in  the  proof  process. 

Chapters  3-5  dealt  largely  with  the  issue  of  verifying  pro¬ 
grams  containing  loops  with  undocumented  or  inadequate  intended 
functions.  In  Chapter  3,  the  idea  of  a  reduction  hypothesis  was 
introduced.  The  use  of  a  reduction  hypothesis  in  a  proof  of 
correctness  eliminates  the  need  for  the  intended  loop  function. 
As  an  alternative  approach,  a  heuristic  procedure  for  construct¬ 
ing  intended  loop  functions  was  described  in  Chapter  4.  Finally, 
in  Chapter  5,  a  class  of  loops  was  described  for  which  inadequate 
intended  loop  functions  can  be  extended  or  generalized  in  a  sys¬ 
tematic  manner. 

The  most  promising  directions  for  future  research  seem  to 
stem  from  the  results  presented  in  Chapter  5.  While  each  uni¬ 
formly  implemented  loop  we  have  studied  does  exhibit  a  "reason¬ 
able"  form  of  behavior,  there  do  exist  "reasonable"  initialized 
loop  programs  which  do  not  satisfy  our  definition.  This  situa¬ 
tion  invariably  occurs  when  initialized  data  is  used  in  an  in¬ 
direct  manner  (e.g.  as  an  array  index  or  pointer).  A  characteri¬ 
zation  of  uniform  loop  behavior  with  respect  to  indirect  data 
would  serve  as  a  useful  coup  lament  to  the  results  described  in 
Chapter  5. 

A  second  direction  for  future  research  is  the  development  of 
a  program  design/implementation  methodology  based  on  the  uniform¬ 
ly  implemented  loop.  At  the  point  of  designing  an  initialised 
loop  program,  the  methodology  would  suggest  the  programmer  first 
formulate  an  appropriate  operation,  and  then  proceed  to  detail 
the  loop  in  a  way  which  insured  that  it  was  uniformly  implemented 
with  respect  to  that  operation.  We  argued  previously  that  uni¬ 
formly  implemented  loops  are  "well  structured"  programs;  such  a 
methodology  might  be  a  valuable  contribution  to  what  has  cosm  to 
be  known  as  "structured  programming.” 
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